GLYCOSIDASE ENZYMES

This application is a continuation-in-part of pending
patent application 08/583,787 filed January 11, 1996.

This invention relates to newly identified
polynucleotides, polypeptides encoded by such
polynucleotides, the use of such polynucleotides and
polypeptides, as well as the production and isolation of such
polynucleotides and polypeptides. More particularly, the
polynucleotides and polypeptides of the present invention have
been putatively identified as glucosidases, α-galactosidases,
β-galactosidases, β-mannosidases, β-mannanases,
endoglucanases, and pullulanases.

The glycosidic bond of β-galactosides can be cleaved by
different classes of enzymes: (i) phospho-β-galactosidases
(EC 3.2.1.85) are specific for a phosphorylated substrate
generated via phosphoenolpyruvate phosphotransferase system
(PTS)-dependent uptake; (ii) typical β-galactosidases (EC
3.2.1.23), represented by the Escherichia coli LacZ enzyme,
which are relatively specific for β-galactosides; and (iii)
β-glucosidases (EC 3.2.1.21) such as the enzymes of
Agrobacterium faecalis, Clostridium thermocellum, Pyrococcus
furiosus or Sulfolobus solfataricus (Day, A.G. and Withers,
S.G., (1986) Purification and characterization of a
β-glucosidase from Alcaligenes faecalis. Can. J. Biochem. Cell.
Biol. 64, 914-922; Kengen, S.W.M., et al. (1993) Eur. J.
Biochem., 213, 305-312; Ait, N., Creuzet, N. and Cattaneo, J.
(1982) Properties of β-glucosidase purified from Clostridium
thermocellum. J. Gen. Microbiol. 128, 569-577; Grogan, D.W.
(1991) Evidence that β-galactosidase of Sulfolobus
solfataricus is only one of several activities of a
thermostable β-D-glycosidase. Appl. Environ. Microbiol. 57,
1644-1649). Members of the latter group, although highly
specific with respect to the β-anomeric configuration of the
glycosidic linkage, often display a rather relaxed substrate
specificity and hydrolyse β-glucosides as well as β-fucosides
and β-galactosides.

Generally, α-galactosidases are enzymes that catalyze
the hydrolysis of galactose groups on a polysaccharide
backbone or the cleavage of di- or oligosaccharides
comprising galactose.

Generally, β-mannanases are enzymes that catalyze the
hydrolysis of mannose groups internally on a polysaccharide
backbone or the cleavage of di- or oligosaccharides
comprising mannose groups. β-Mannosidases hydrolyze
non-reducing, terminal mannose residues on a
mannose-containing polysaccharide and catalyze the cleavage of
di- or oligosaccharides comprising mannose groups.

Guar gum is a branched galactomannan polysaccharide
composed of a β-1,4 linked mannose backbone with α-1,6 linked
galactose sidechains. The enzymes required for the
degradation of guar are β-mannanase, β-mannosidase and
α-galactosidase. β-Mannanase hydrolyses the mannose backbone
internally and β-mannosidase hydrolyses non-reducing,
terminal mannose residues. α-Galactosidase hydrolyses
α-linked galactose groups.

Galactomannan polysaccharides and the enzymes that
degrade them have a variety of applications. Guar is
commonly used as a thickening agent in food and is utilized
in hydraulic fracturing in oil and gas recovery.
Consequently, galactomannanases are industrially relevant for
the degradation and modification of guar. Furthermore, a
need exists for thermostable galactomannanases that are active
in extreme conditions associated with drilling and well
stimulation.

There are other applications for these enzymes in
various industries, such as in the beet sugar industry.
20-30% of the domestic U.S. sucrose consumption is sucrose from
sugar beets. Raw beet sugar can contain a small amount of
raffinose when the sugar beets are stored before processing
and rotting begins to set in. Raffinose inhibits the
crystallization of sucrose and also constitutes a hidden
quantity of sucrose. Thus, there is merit to eliminating
raffinose from raw beet sugar. α-Galactosidase has also been
used as a digestive aid to break down raffinose, stachyose,
and verbascose in such foods as beans and other gassy foods.

β-Galactosidases which are active and stable at high
temperatures appear to be superior enzymes for the production
of lactose-free dietary milk products (Chaplin, M.F. and
Bucke, C. (1990) In: Enzyme Technology, pp. 159-160,
Cambridge University Press, Cambridge, UK). Also, several
studies have demonstrated the applicability of
β-galactosidases to the enzymatic synthesis of oligosaccharides
via transglycosylation reactions (Nilsson, K.G.I. (1988)
Enzymatic synthesis of oligosaccharides. Trends Biotechnol.
6, 156-264; Cote, G.L. and Tao, B.Y. (1990) Oligosaccharide
synthesis by enzymatic transglycosylation. Glycoconjugate J.
7, 145-162). Despite the commercial potential, only a few
β-galactosidases of thermophiles have been characterized so
far. Two genes reported encode β-galactoside-cleaving enzymes
of the hyperthermophilic bacterium Thermotoga maritima, one
of the most thermophilic organotrophic eubacteria described
to date (Huber, R., Langworthy, T.A., Konig, H., Thomm, M.,
Woese, C.R., Sleytr, U.B. and Stetter, K.O. (1986) T. maritima
sp. nov. represents a new genus of unique extremely
thermophilic eubacteria growing up to 90°C. Arch. Microbiol.
144, 324-333). The gene products have been
identified as a β-galactosidase and a β-glucosidase.

Pullulanase is well known as a debranching enzyme of
pullulan and starch. The enzyme hydrolyzes α-1,6-glucosidic
linkages on these polymers. Starch degradation for the
production of sweeteners (glucose or maltose) is a very
important industrial application of this enzyme. The
degradation of starch is carried out in two stages. The first
stage involves the liquefaction of the substrate with
α-amylase, and the second stage, or saccharification stage, is
performed by β-amylase with pullulanase added as a
debranching enzyme, to obtain better yields.

Endoglucanases can be used in a variety of industrial
applications. For instance, the endoglucanases of the
present invention can hydrolyze the internal β-1,4-glycosidic
bonds in cellulose, which may be used for the conversion of
plant biomass into fuels and chemicals. Endoglucanases also
have applications in detergent formulations, the textile
industry, in animal feed, in waste treatment, and in the
fruit juice and brewing industry for the clarification and
extraction of juices.

The polynucleotides and polypeptides of the present
invention have been identified as glucosidases,
α-galactosidases, β-galactosidases, β-mannosidases,
β-mannanases, endoglucanases, and pullulanases as a result of
their enzymatic activity.

In accordance with one aspect of the present invention, 
there are provided novel enzymes, as well as active 
fragments, analogs and derivatives thereof. 

In accordance with another aspect of the present
invention, there are provided isolated nucleic acid molecules
encoding the enzymes of the present invention including
mRNAs, cDNAs, genomic DNAs as well as active analogs and
fragments of such enzymes.



WO 97/25417    PCT/US97/00092



In accordance with another aspect of the present 
invention there are provided isolated nucleic acid molecules 
encoding mature polypeptides expressed by the DNA contained 
in ATCC Deposit No. 97379. 

In accordance with yet a further aspect of the present 
invention, there is provided a process for producing such 
polypeptides by recombinant techniques comprising culturing 
recombinant prokaryotic and/or eukaryotic host cells, 
containing a nucleic acid sequence of the present invention, 
under conditions promoting expression of said enzymes and 
subsequent recovery of said enzymes. 

In accordance with yet a further aspect of the present
invention, there is provided a process for utilizing such
enzymes, or polynucleotides encoding such enzymes, for
hydrolyzing lactose to galactose and glucose for use in the
food processing industry, the pharmaceutical industry, for
example, to treat intolerance to lactose, as a diagnostic
reporter molecule, in corn wet milling, in the fruit juice
industry, in baking, in the textile industry and in the
detergent industry.

In accordance with yet a further aspect of the present
invention, there is provided a process for utilizing such
enzymes for hydrolyzing guar gum (a galactomannan
polysaccharide) to remove non-reducing terminal mannose
residues. Further, polysaccharides such as galactomannan and
the enzymes according to the invention that degrade them have
a variety of applications. Guar gum is commonly used as a
thickening agent in food and also is utilized in hydraulic
fracturing in oil and gas recovery. Consequently, mannanases
are industrially relevant for the degradation and
modification of guar gums. Furthermore, a need exists for
thermostable mannanases that are active in extreme conditions
associated with drilling and well stimulation.

In accordance with yet a further aspect of the present
invention, there are also provided nucleic acid probes
comprising nucleic acid molecules of sufficient length to
specifically hybridize to a nucleic acid sequence of the
present invention.

In accordance with yet a further aspect of the present
invention, there is provided a process for utilizing such
enzymes, or polynucleotides encoding such enzymes, for in
vitro purposes related to scientific research, for example,
to generate probes for identifying similar sequences which
might encode similar enzymes from other organisms by using
certain regions, i.e., conserved sequence regions, of the
nucleotide sequence.

These and other aspects of the present invention should 
be apparent to those skilled in the art from the teachings 
herein. 

Brief Description of the Drawings 

The following drawings are illustrative of embodiments
of the invention and are not meant to limit the scope of the
invention as encompassed by the claims.

Figure 1 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of M11TL of the
present invention. Sequencing was performed using a 378
automated DNA sequencer for all sequences of the present
invention (Applied Biosystems, Inc.).

Figure 2 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of OC1/4V-33B/G.

Figure 3 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of F1-12G.

Figure 4 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of 9N2-31B/G.

Figure 5 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of MSB8-6G.

Figure 6 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of AEDII12RA-18B/G.

Figure 7 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of GC74-22G.

Figure 8 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of VC1-7G1.

Figure 9 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of 37GP1.

Figure 10 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of 6GC2.

Figure 11 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of 6GP2.

Figure 12 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of 63GB1.

Figure 13 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of OC1/4V.

Figure 14 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of 6GP3.

Definitions 

The term "gene" means the segment of DNA involved in
producing a polypeptide chain; it includes regions preceding
and following the coding region (leader and trailer) as well
as intervening sequences (introns) between individual coding
segments (exons).

A coding sequence is "operably linked to" another coding
sequence when RNA polymerase will transcribe the two coding
sequences into a single mRNA, which is then translated into
a single polypeptide having amino acids derived from both
coding sequences. The coding sequences need not be
contiguous to one another so long as the expressed sequences
are ultimately processed to produce the desired protein.

"Recombinant" enzymes refer to enzymes produced by
recombinant DNA techniques; i.e., produced from cells
transformed by an exogenous DNA construct encoding the
desired enzyme. "Synthetic" enzymes are those prepared by
chemical synthesis.

A DNA "coding sequence of" or a "nucleotide sequence
encoding" a particular enzyme is a DNA sequence which is
transcribed and translated into an enzyme when placed under
the control of appropriate regulatory sequences.

Summary of the Invention

In accordance with an aspect of the present invention, 
there are provided isolated nucleic acids (polynucleotides) 
which encode for the mature enzymes having the deduced amino 
acid sequences of Figures 1-14 (SEQ ID NOS: 15-28) . 

In accordance with another aspect of the present
invention, there are provided isolated polynucleotides
encoding the enzymes of the present invention. The deposited
material is a mixture of genomic clones comprising DNA
encoding an enzyme of the present invention. Each genomic
clone comprising the respective DNA has been inserted into a
pBluescript vector (Stratagene, La Jolla, CA). The deposit
has been deposited with the American Type Culture Collection,
12301 Parklawn Drive, Rockville, Maryland 20852, USA, on
December 13, 1995 and assigned ATCC Deposit No. 97379.

The deposit(s) have been made under the terms of the
Budapest Treaty on the International Recognition of the
Deposit of Micro-organisms for Purposes of Patent Procedure.
The strains will be irrevocably and without restriction or
condition released to the public upon the issuance of a
patent. These deposits are provided merely as a convenience to
those of skill in the art and are not an admission that a
deposit is required under 35 U.S.C. §112. The sequences of
the polynucleotides contained in the deposited materials, as
well as the amino acid sequences of the polypeptides encoded
thereby, are controlling in the event of any conflict with
any description of sequences herein. A license may be
required to make, use or sell the deposited materials, and no
such license is hereby granted.

Detailed Description of the Invention 

The polynucleotides of this invention were originally 
recovered from genomic gene libraries derived from the 
following organisms : 

M11TL is a new species of Desulfurococcus isolated from
Diamond Pool in Yellowstone National Park. The organism
grows optimally at 85-88°C, pH 7.0 in a low salt medium
containing yeast extract, peptone, and gelatin as substrates
with a N2/CO2 gas phase.

OC1/4V is from the genus Thermotoga. The organism was
isolated from Yellowstone National Park. It grows optimally
at 75°C in a low salt medium with cellulose as a substrate
and in gas phase.

Pyrococcus furiosus VC1 is from the genus Pyrococcus.
VC1 was isolated from Vulcano, Italy. It grows optimally at
100°C in a high salt medium (marine) containing elemental
sulfur, yeast extract, peptone and starch as substrates and
N2 in gas phase.

Staphylothermus marinus F1 is from the genus
Staphylothermus. F1 was isolated from Vulcano, Italy. It
grows optimally at 85°C, pH 6.5 in high salt medium (marine)
containing elemental sulfur and yeast extract as substrates
and in gas phase.

Thermococcus 9N-2 is from the genus Thermococcus. 9N-2
was isolated from diffuse vent fluid in the East Pacific
Rise. It is a strict anaerobe that grows optimally at 87°C.

Thermotoga maritima MSB8 is from the genus Thermotoga,
and was isolated from Vulcano, Italy. MSB8 grows optimally
at 85°C, pH 6.5 in a high salt medium (marine) containing
starch and yeast extract as substrates and in gas phase.

Thermococcus alcaliphilus AEDII12RA is from the genus
Thermococcus. AEDII12RA grows optimally at 85°C, pH 9.5 in
a high salt medium (marine) containing polysulfides and yeast
extract as substrates and N2 in gas phase.

Thermococcus chitonophagus GC74 is from the genus
Thermococcus. GC74 grows optimally at 85°C, pH 6.0 in a high
salt medium (marine) containing chitin, meat extract,
elemental sulfur and yeast extract as substrates and N2 in gas
phase. AEPII 1a grows optimally at 85°C at pH 6.5 in marine
medium under anaerobic conditions. It has many substrates.
[Add descriptions of new organisms] 

Accordingly, the polynucleotides and enzymes encoded
thereby are identified by the organism from which they were
isolated, and are sometimes hereinafter referred to as
"M11TL" (Figure 1 and SEQ ID NOS:1 and 15), "OC1/4V-33B/G"
(Figure 2 and SEQ ID NOS:2 and 16), "F1-12G" (Figure 3 and
SEQ ID NOS:3 and 17), "9N2-31B/G" (Figure 4 and SEQ ID NOS:4
and 18), "MSB8" (Figure 5 and SEQ ID NOS:5 and 19),
"AEDII12RA-18B/G" (Figure 6 and SEQ ID NOS:6 and 20),
"GC74-22G" (Figure 7 and SEQ ID NOS:7 and 21), "VC1-7G1"
(Figure 8 and SEQ ID NOS:8 and 22), "37GP1" (Figure 9 and SEQ
ID NOS:9 and 23), "6GC2" (Figure 10 and SEQ ID NOS:10 and 24),
"6GP2" (Figure 11 and SEQ ID NOS:11 and 25), "AEPII 1a"
(Figure 12 and SEQ ID NOS:12 and 26), "OC1/4V" (Figure 13 and
SEQ ID NOS:13 and 27), and "6GP3" (Figure 14 and SEQ ID
NOS:28).

The polynucleotides and polypeptides of the present
invention show identity at the nucleotide and protein level
to known genes and proteins encoded thereby as shown in Table
1.



Table 1

Clone                       Gene/Protein with             Protein   Nucleic Acid
                            Closest Homology              Identity  Identity
--------------------------------------------------------------------------------
M11TL-29G                   Sulfolobus solfataricus       51%       55%
                            DSM 1616/P1,
                            β-galactosidase

OC1/4V-33B/G                Caldocellum saccharolyticum,  52%       57%
                            β-glucosidase

Staphylothermus marinus     Bacillus polymyxa,            36%       48%
F1-12G                      β-galactosidase

Thermococcus 9N2-31B/G      Sulfolobus solfataricus       51%       50%
                            ATCC 49255/MT4,
                            β-galactosidase

Thermotoga maritima         Clostridium thermocellum      45%       53%
MSB8-6G                     bglB

Thermococcus                Bacillus polymyxa,            34%       48%
AEDII12RA-18B/G             β-galactosidase

Thermococcus chitonophagus  Sulfolobus solfataricus       46%       54%
GC74-22G                    ATCC 49255/MT4,
                            β-galactosidase

Pyrococcus furiosus VC1-7G1 Sulfolobus solfataricus/MT-4  46.4%     52.5%
                            β-galactosidase

Thermotoga maritima         Pediococcus pentosaceus       49%       29%
α-galactosidase (6GC2)      α-galactosidase

Thermotoga maritima         Aspergillus aculeatus         56%       37%
β-mannanase (6GP2)          mannanase

AEPII 1a β-mannosidase      Sulfolobus solfataricus       78%       56%
(63GB1)                     β-galactosidase

OC1/4V endoglucanase        Clostridium thermocellum      65%       43%
(33GP1)                     endo-1,4-β-endoglucanase

Thermotoga maritima         Caldocellum saccharolyticum   72%       53%
pullulanase (6GP3)          α-dextran 6-glucanohydrolase

Bankia gouldi mix           None available
endoglucanase (37GP1)







The polynucleotides and enzymes of the present invention
show homology to each other as shown in Table 2.

Table 2

Clone                       Gene/Protein with             Protein   Nucleic Acid
                            Closest Homology              Identity  Identity
--------------------------------------------------------------------------------
Staphylothermus marinus     Thermococcus                  55%       57%
F1-12G                      AEDII12RA-18B/G,
                            β-galactosidase,
                            glucosidase

Thermococcus 9N2-31B/G      Thermococcus chitonophagus    74%       66%
                            GC74-22G glucosidase

Pyrococcus furiosus VC1-7G1 Pyrococcus furiosus           46.4%     54%
                            VC1-7B/G β-galactosidase



All the clones identified in Tables 1 and 2 encode
polypeptides which have α-glycosidase or β-glycosidase
activity.

This invention, in addition to the isolated nucleic acid
molecules encoding the enzymes of the present invention, also
provides substantially similar sequences. Isolated nucleic
acid sequences are substantially similar if: (i) they are
capable of hybridizing, under conditions hereinafter
described, to the polynucleotides of SEQ ID NOS:1-8; or (ii)
they encode DNA sequences which are degenerate to the
polynucleotides of SEQ ID NOS:1-8. Degenerate DNA sequences
encode the amino acid sequences of SEQ ID NOS:9-16, but have
variations in the nucleotide coding sequences. As used
herein, substantially similar refers to the sequences having
similar identity to the sequences of the instant invention.
The nucleotide sequences that are substantially the same can
be identified by hybridization or by sequence comparison.
Enzyme sequences that are substantially the same can be
identified by one or more of the following: proteolytic
digestion, gel electrophoresis and/or microsequencing.

One means for isolating the nucleic acid molecules
encoding the enzymes of the present invention is to probe a
gene library with a natural or artificially designed probe
using art recognized procedures (see, for example: Current
Protocols in Molecular Biology, Ausubel F.M. et al. (EDS.)
Greene Publishing Company Assoc. and John Wiley Interscience,
New York, 1989, 1992). It is appreciated by one skilled in
the art that the polynucleotides of SEQ ID NOS:1-14 or
fragments thereof (comprising at least 12 contiguous
nucleotides) are particularly useful probes. Other
particularly useful probes for this purpose are fragments
hybridizable to the sequences of SEQ ID NOS:1-14 (i.e.,
comprising at least 12 contiguous nucleotides).

With respect to nucleic acid sequences which hybridize
to specific nucleic acid sequences disclosed herein,
hybridization may be carried out under conditions of reduced
stringency, medium stringency or even stringent conditions.
As an example of oligonucleotide hybridization, a polymer
membrane containing immobilized denatured nucleic acids is
first prehybridized for 30 minutes at 45°C in a solution
consisting of 0.9 M NaCl, 50 mM NaH2PO4, pH 7.0, 5.0 mM
Na2EDTA, 0.5% SDS, 10X Denhardt's, and 0.5 mg/mL
polyriboadenylic acid. Approximately 2 x 10^7 cpm (specific
activity 4-9 x 10^8 cpm/µg) of 32P end-labeled oligonucleotide
probe are then added to the solution. After 12-16 hours of
incubation, the membrane is washed for 30 minutes at room
temperature in 1X SET (150 mM NaCl, 20 mM Tris hydrochloride,
pH 7.8, 1 mM Na2EDTA) containing 0.5% SDS, followed by a 30
minute wash in fresh 1X SET at Tm-10°C for the
oligonucleotide probe. The membrane is then exposed to
autoradiographic film for detection of hybridization signals.
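The final wash above is specified relative to the melting temperature (Tm) of the oligonucleotide probe, but the text does not say how Tm is estimated. As a rough sketch only, the following assumes the Wallace rule (2°C per A or T, 4°C per G or C), a common approximation for short oligonucleotides that is not taken from this specification. The function names and the probe sequence are hypothetical.

```python
def wallace_tm(oligo: str) -> float:
    """Estimate Tm in deg C with the Wallace rule (an assumption, see above)."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2.0 * at + 4.0 * gc


def stringent_wash_temperature(oligo: str, offset_c: float = 10.0) -> float:
    """Temperature of the second wash, i.e. Tm minus 10 deg C by default."""
    return wallace_tm(oligo) - offset_c


# Hypothetical 18-mer probe, not drawn from any SEQ ID NO.
probe = "ATGGCTCTGAAAGGTACC"
print(wallace_tm(probe), stringent_wash_temperature(probe))
```

For probes much longer than about 20 nucleotides a salt-adjusted or nearest-neighbor estimate would normally be preferred over this rule of thumb.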

Stringent conditions means hybridization will occur only
if there is at least 90% identity, preferably at least 95%
identity and most preferably at least 97% identity between
the sequences. See J. Sambrook et al., Molecular Cloning, A
Laboratory Manual, 2d Ed., Cold Spring Harbor Laboratory
(1989), which is hereby incorporated by reference in its
entirety. Further, it is understood that a fragment of a
100 bp sequence that is 95 bp in length has 95% identity
with the 100 bp sequence from which it is obtained.

As used herein, a first DNA (RNA) sequence is at least
70% and preferably at least 80% identical to another DNA
(RNA) sequence if there is at least 70% and preferably at
least 80% or 90% identity, respectively, between the bases
of the first sequence and the bases of the other sequence,
when properly aligned with each other, for example when
aligned by BLASTN.

"Identity", as the term is used herein, refers to a
polynucleotide sequence which comprises a percentage of the
same bases as a reference polynucleotide (SEQ ID NOS:1-8).
For example, a polynucleotide which is at least 90% identical
to a reference polynucleotide has polynucleotide bases which
are identical in 90% of the bases which make up the reference
polynucleotide and may have different bases in 10% of the
bases which comprise that polynucleotide sequence.
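The definition of identity above can be illustrated with a short calculation. This is a minimal sketch, assuming the two sequences have already been aligned to equal length (for example by BLASTN) with "-" marking gaps. The function name and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(reference: str, candidate: str) -> float:
    """Percentage of reference bases matched by the aligned candidate.

    Both strings must be pre-aligned to the same length; '-' marks a gap.
    Positions where the reference itself has a gap are not counted.
    """
    if len(reference) != len(candidate):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(r, c) for r, c in zip(reference.upper(), candidate.upper())
             if r != "-"]
    if not pairs:
        raise ValueError("reference contains no bases")
    matches = sum(1 for r, c in pairs if r == c)
    return 100.0 * matches / len(pairs)


# One mismatch in ten reference bases.
print(percent_identity("ACGTACGTAC", "ACGTACGTAG"))

# A 95-base fragment of a 100-base reference, padded with gaps.
reference = "ACGT" * 25
fragment = reference[:95] + "-" * 5
print(percent_identity(reference, fragment))
```

The second call mirrors the statement that a 95 bp fragment of a 100 bp sequence has 95% identity with the sequence from which it is obtained.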

The present invention relates to polynucleotides which
differ from the reference polynucleotide such that the
changes are silent changes, for example the changes do not
alter the amino acid sequence encoded by the polynucleotide.
The present invention also relates to nucleotide changes
which result in amino acid substitutions, additions,
deletions, fusions and truncations in the polypeptide encoded
by the reference polynucleotide. In a preferred aspect of
the invention these polypeptides retain the same biological
action as the polypeptide encoded by the reference
polynucleotide.

It is also appreciated that such probes can be and are
preferably labeled with an analytically detectable reagent to
facilitate identification of the probe. Useful reagents
include but are not limited to radioactivity, fluorescent
dyes or enzymes capable of catalyzing the formation of a
detectable product. The probes are thus useful to isolate
complementary copies of DNA from other sources or to screen
such sources for related sequences.

The polynucleotides of this invention were recovered
from genomic gene libraries from the organisms listed in
Table 1. For example, gene libraries can be generated in the
Lambda ZAP II cloning vector (Stratagene Cloning Systems).
Mass excisions can be performed on these libraries to
generate libraries in the pBluescript phagemid. Libraries
are thus generated and excisions performed according to the
protocols/methods hereinafter described.

The excision libraries are introduced into the E. coli
strain BW14893 F'kan1A. Expression clones are then
identified using a high temperature filter assay. Expression
clones encoding several glucanases and several other
glycosidases are identified and repurified. The
polynucleotides, and enzymes encoded thereby, of the present
invention, yield the activities as described above.

The coding sequences for the enzymes of the present
invention were identified by screening the genomic DNAs
prepared for the clones having glucosidase or galactosidase
activity.

An example of such an assay is a high temperature filter
assay, wherein expression clones were identified using
buffer Z (see recipe below) containing 1 mg/ml of the
substrate 5-bromo-4-chloro-3-indolyl-β-D-glucopyranoside
(XGLU) (Diagnostic Chemicals Limited or Sigma) after
introducing an excision library into the E. coli strain
BW14893 F'kan1A. Expression clones encoding XGLUases were
identified and repurified from M11TL, OC1/4V, Pyrococcus
furiosus VC1, Staphylothermus marinus F1, Thermococcus 9N-2,
Thermotoga maritima MSB8, Thermococcus alcaliphilus
AEDII12RA, and Thermococcus chitonophagus GC74.

Z-buffer: (referenced in Miller, J.H. (1992) A Short
Course in Bacterial Genetics, p. 445.)

per liter: 

Na2HPO4-7H2O          16.1 g
NaH2PO4-7H2O          5.5 g
KCl                   0.75 g
MgSO4-7H2O            0.246 g
β-mercaptoethanol     2.7 ml

Adjust pH to 7.0
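Because the recipe is given per liter while filter assays typically need much smaller batches, the amounts can be scaled linearly. The sketch below is illustrative only: the dictionary labels, function names and the 250 ml batch size are hypothetical, while the per-liter quantities and the 1 mg/ml substrate concentration are the ones stated in the text.

```python
# Z-buffer components per liter, as listed in the recipe above.
# Keys are illustrative labels; hydrate forms follow the recipe text.
Z_BUFFER_PER_LITER = {
    "sodium phosphate, dibasic (g)": 16.1,
    "sodium phosphate, monobasic (g)": 5.5,
    "KCl (g)": 0.75,
    "MgSO4-7H2O (g)": 0.246,
    "beta-mercaptoethanol (ml)": 2.7,
}


def scale_z_buffer(volume_ml: float) -> dict:
    """Return the amount of each component for the requested batch volume."""
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    factor = volume_ml / 1000.0
    return {name: round(amount * factor, 4)
            for name, amount in Z_BUFFER_PER_LITER.items()}


def substrate_mg(volume_ml: float, mg_per_ml: float = 1.0) -> float:
    """Chromogenic substrate (XGAL or XGLU) needed, 1 mg/ml by default."""
    return volume_ml * mg_per_ml


batch = scale_z_buffer(250)
for name, amount in batch.items():
    print(f"{name}: {amount}")
print("substrate (mg):", substrate_mg(250))
```

The pH adjustment to 7.0 still has to be made on the prepared batch; it is not something the scaling accounts for.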

High Temperature Filter Assay 

(1) The F factor F'kan (from E. coli strain CSH118)(1) was
introduced into the pho-pnh- lac- strain BW14893(2).
The filamentous phage library was plated on
the resulting strain, BW14893 F'kan. (Miller, J.H.
(1992) A Short Course in Bacterial Genetics; Lee, K.S.,
Metcalf, et al., (1992) Evidence for two phosphonate
degradative pathways in Enterobacter aerogenes, J.
Bacteriol., 174:2501-2510.)

(2) After growth on 100 mm LB plates containing 100 µg/ml
ampicillin, 80 µg/ml methicillin and 1 mM IPTG, colony
lifts were performed using Millipore HATF membrane
filters.

(3) The colonies transferred to the filters were lysed with 
chloroform vapor in 150 mm glass petri dishes. 

(4) The filters were transferred to 100 mm glass petri
dishes containing a piece of Whatman 3MM filter paper
saturated with buffer.

(a) When testing for galactosidase activity
(XGALase), 3MM paper was saturated with Z buffer
containing 1 mg/ml XGAL (ChemBridge Corporation).
After transferring the filter bearing lysed colonies to
the glass petri dish, the dish was placed in an oven at
80-85°C.

(b) When testing for glucosidase (XGLUase), 3MM
paper was saturated with Z buffer containing 1
mg/ml XGLU. After transferring the filter bearing
lysed colonies to the glass petri dish, the dish was
placed in an oven at 80-85°C.

(5) 'Positives' were observed as blue spots on the filter
membranes. The following filter rescue technique was used
to retrieve plasmid from a lysed positive colony. A
pasteur pipette (or glass capillary tube) was used to core
blue spots on the filter membrane. The small filter
disk was placed in an Eppendorf tube containing 20 µl water.
The Eppendorf tube was incubated at 75°C for 5 minutes,
followed by vortexing to elute plasmid DNA off the filter.
This DNA was transformed into electrocompetent E. coli
cells DH10B for Thermotoga maritima MSB8-6G,
Staphylothermus marinus F1-12G, Thermococcus
AEDII12RA-18B/G, Thermococcus chitonophagus GC74-22G, M11TL
and OC1/4V. Electrocompetent BW14893 F'kan1A E. coli were
used for Thermococcus 9N2-31B/G and Pyrococcus furiosus
VC1-7G1. The filter-lift assay was repeated on
transformation plates to identify 'positives'. The
transformation plates were returned to a 37°C incubator
after the filter lift to regenerate colonies. 3 ml of LB
liquid containing 100 µg/ml ampicillin was inoculated with
repurified positives and incubated at 37°C overnight.
Plasmid DNA was isolated from these cultures and the
plasmid insert was sequenced. In some instances where
the plates used for the initial colony lifts contained
non-confluent colonies, a specific colony corresponding
to a blue spot on the filter could be identified on a
regenerated plate and repurified directly, instead of
using the filter rescue technique.

Another example of such an assay is a variation of the
high temperature filter assay wherein colony-laden filters
are heat-killed at different temperatures (for example, 105°C
for 20 minutes) to monitor thermostability. The 3MM paper is
saturated with different buffers (i.e., 100 mM NaCl, 5 mM
MgCl2, 100 mM Tris-Cl (pH 9.5)) to determine enzyme activity
under different buffer conditions.

A β-glucosidase assay may also be employed, wherein
GlcpβNp is used as an artificial substrate (aryl-β-
glucosidase). The increase in absorbance at 405 nm as a
result of p-nitrophenol (pNp) liberation is followed on a
Hitachi U-1100 spectrophotometer equipped with a
thermostatted cuvette holder. The assays may be performed at
80°C or 90°C in a closed 1-ml quartz cuvette. A standard
reaction mixture contains 150 mM trisodium substrate, pH 5.0
(at 80°C), and 0.95 mM pNp derivative (εpNp = 0.561 mM⁻¹·cm⁻¹).
The reaction mixture is allowed to reach the desired
temperature, after which the reaction is started by injecting
an appropriate amount of enzyme (1.06 ml final volume).

One unit (1 U) of β-glucosidase activity is defined as that
amount required to catalyze the formation of 1.0 µmol pNp/min.
D-cellobiose may also be used as a substrate.
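
By way of illustration only, the unit definition above can be expressed as a short calculation. The following sketch is not part of the described assay: it assumes a 1 cm cuvette path length (not stated above), takes the extinction coefficient and final volume given above as defaults, and uses a hypothetical function name.

```python
def beta_glucosidase_units(delta_a405_per_min,
                           epsilon_mm_cm=0.561,  # mM^-1 cm^-1, value quoted in the text
                           path_cm=1.0,          # ASSUMPTION: 1 cm path length
                           volume_ml=1.06):      # final reaction volume from the text
    """Return activity in U (umol pNp liberated per minute) in the cuvette."""
    if epsilon_mm_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    # Beer-Lambert law: concentration change (mM/min) = (dA/min) / (epsilon * path)
    rate_mm_per_min = delta_a405_per_min / (epsilon_mm_cm * path_cm)
    # mM multiplied by ml gives umol, i.e. umol pNp formed per minute
    return rate_mm_per_min * volume_ml
```

Under these assumptions, an absorbance increase of 0.561 per minute corresponds to 1 mM pNp per minute, or 1.06 µmol per minute in the 1.06 ml reaction.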

An ONPG assay for β-galactosidase activity is described
by Miller, J.H. (1992) A Short Course in Bacterial Genetics
and Miller, J.H. (1992) Experiments in Molecular Genetics, the
contents of which are hereby incorporated by reference in
their entirety.

A quantitative fluorometric assay for β-galactosidase
specific activity is described by Youngman, P. (1987)
Plasmid Vectors for Recovering and Exploiting Tn917
Transpositions in Bacillus and other Gram-Positive Bacteria.
In Plasmids: A Practical Approach (ed. K. Hardy) pp. 79-103.
IRL Press, Oxford. A description of the procedure can be
found in Miller (1992) p. 75-77, the contents of which are
incorporated by reference herein in their entirety.

The polynucleotides of the present invention may be in
the form of DNA, which DNA includes cDNA, genomic DNA, and
synthetic DNA. The DNA may be double-stranded or single-
stranded, and if single-stranded may be the coding strand or
non-coding (anti-sense) strand. The coding sequences which
encode the mature enzymes may be identical to the coding
sequences shown in Figures 1-8 (SEQ ID NOS: 1-8) or may be a
different coding sequence which coding sequence, as a result
of the redundancy or degeneracy of the genetic code, encodes
the same mature enzymes as the DNA of Figures 1-14 (SEQ ID
NOS: 1-14).
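
The degeneracy referred to above can be illustrated with a minimal sketch. It assumes the standard genetic code; the two short coding sequences are invented for illustration and are not taken from the Figures, and the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(coding_seq):
    """Translate a DNA coding sequence (length divisible by 3) into protein."""
    seq = coding_seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))

# Two different (illustrative) coding sequences that differ at third codon positions
original = "ATGGTGAATGCT"
variant = "ATGGTCAACGCA"
same_protein = translate(original) == translate(variant)
```

Here both sequences translate to the same four residues even though the DNA differs at three positions, which is the sense in which a different coding sequence may encode the same mature enzyme.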

The polynucleotide which encodes for the mature enzyme 
of Figures 1-14 (SEQ ID NOS: 15-28) may include, but is not 
limited to: only the coding sequence for the mature enzyme; 
the coding sequence for the mature enzyme and additional 
coding sequence such as a leader sequence or a proprotein 
sequence; the coding sequence for the mature enzyme (and 
optionally additional coding sequence) and non-coding 
sequence, such as introns or non-coding sequence 5' and/or 3' 
of the coding sequence for the mature enzyme.

Thus, the term "polynucleotide encoding an enzyme 
(protein)" encompasses a polynucleotide which includes only 
coding sequence for the enzyme as well as a polynucleotide 
which includes additional coding and/or non-coding sequence.

The present invention further relates to variants of the 
hereinabove described polynucleotides which encode for 
fragments, analogs and derivatives of the enzymes having the 
deduced amino acid sequences of Figures 1-14 (SEQ ID NOS: 15-
28). The variant of the polynucleotide may be a naturally
occurring allelic variant of the polynucleotide or a non-
naturally occurring variant of the polynucleotide.

Thus, the present invention includes polynucleotides 
encoding the same mature enzymes as shown in Figures 1-14 
(SEQ ID NOS:15-28) as well as variants of such 
polynucleotides which variants encode for a fragment, 
derivative or analog of the enzymes of Figures 1-14 (SEQ ID 
NOS: 15-28). Such nucleotide variants include deletion
variants, substitution variants and addition or insertion
variants.

As hereinabove indicated, the polynucleotides may have 
a coding sequence which is a naturally occurring allelic 
variant of the coding sequences shown in Figures 1-14 (SEQ ID
NOS: 1-14). As known in the art, an allelic variant is an
alternate form of a polynucleotide sequence which may have a
substitution, deletion or addition of one or more 
nucleotides, which does not substantially alter the function 
of the encoded enzyme. 

Fragments of the full length gene of the present
invention may be used as a hybridization probe for a cDNA or
a genomic library to isolate the full-length DNA and to
isolate other DNAs which have a high sequence similarity to
the gene or similar biological activity. Probes of this type
preferably have at least 10, preferably at least 15, and even
more preferably at least 30 bases and may contain, for
example, at least 50 or more bases. The probe may also be
used to identify a DNA clone corresponding to a full length
transcript and a genomic clone or clones that contain the
complete gene including regulatory and promoter regions,
exons, and introns. An example of a screen comprises
isolating the coding region of the gene by using the known
DNA sequence to synthesize an oligonucleotide probe. Labeled
oligonucleotides having a sequence complementary to that of
the gene of the present invention are used to screen a
library of genomic DNA to determine which members of the
library the probe hybridizes to.
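
As a rough illustration of how a complementary oligonucleotide probe might be derived from a known DNA sequence, the sketch below takes a fragment of a gene and returns its reverse complement. The function names, the default probe length of 30 bases (one of the preferred minimum lengths mentioned above) and the example sequence are assumptions made for illustration only.

```python
# Translation table mapping each base to its Watson-Crick complement
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def design_probe(gene_seq, start, length=30):
    """Return a probe complementary to gene_seq[start:start + length]."""
    if length < 10:
        raise ValueError("probes of at least 10 bases are preferred")
    fragment = gene_seq[start:start + length]
    if len(fragment) < length:
        raise ValueError("requested probe extends past the end of the sequence")
    return reverse_complement(fragment)
```

For example, `design_probe("ATGGTGAATGCTATGATTGTC", 0, 10)` yields the 10-base oligonucleotide complementary to the first ten bases of that illustrative sequence.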

The present invention further relates to
polynucleotides which hybridize to the hereinabove-described
sequences if there is at least 70%, preferably at least 90%,
and more preferably at least 95% identity between the
sequences. The present invention particularly relates to
polynucleotides which hybridize under stringent conditions to
the hereinabove-described polynucleotides. As herein used,
the term "stringent conditions" means hybridization will
occur only if there is at least 95% and preferably at least
97% identity between the sequences. The polynucleotides
which hybridize to the hereinabove-described polynucleotides
in a preferred embodiment encode enzymes which retain
substantially the same biological function or activity as the
mature enzyme encoded by the DNA of Figures 1-14 (SEQ ID
NOS: 1-14).
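
The identity thresholds recited above can be illustrated with a simple calculation on two sequences that have already been aligned. This is only a sketch: it assumes equal-length, pre-aligned inputs in which "-" denotes a gap, counts gapped columns as mismatches (one of several possible conventions), and uses hypothetical function names.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same base in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    # A column matches only when both entries are the same non-gap character
    matches = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)

def identity_tier(pct):
    """Map a percent identity onto the 70% / 90% / 95% tiers named in the text."""
    for threshold in (95, 90, 70):
        if pct >= threshold:
            return "at least %d%%" % threshold
    return "below 70%"
```

Two aligned 10-base sequences differing at a single position are 90% identical and therefore fall into the "at least 90%" tier.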

Alternatively, the polynucleotide may have at least 15
bases, preferably at least 30 bases, and more preferably at
least 50 bases which hybridize to any part of a
polynucleotide of the present invention and which has an
identity thereto, as hereinabove described, and which may or
may not retain activity. For example, such polynucleotides
may be employed as probes for the polynucleotides of SEQ ID
NOS: 1-14, for example, for recovery of the polynucleotide or
as a diagnostic probe or as a PCR primer.

Thus, the present invention is directed to
polynucleotides having at least a 70% identity, preferably at
least 90% identity and more preferably at least a 95%
identity to a polynucleotide which encodes the enzymes of SEQ
ID NOS: 15-28 as well as fragments thereof, which fragments
have at least 15 bases, preferably at least 30 bases and most
preferably at least 50 bases, which fragments are at least
90% identical, preferably at least 95% identical and most
preferably at least 97% identical under stringent conditions
to any portion of a polynucleotide of the present invention.

The present invention further relates to enzymes which
have the deduced amino acid sequences of Figures 1-14 (SEQ ID
NOS: 15-28) as well as fragments, analogs and derivatives of
such enzymes.

The terms "fragment," "derivative" and "analog" when
referring to the enzymes of Figures 1-14 (SEQ ID NOS: 15-28)
mean enzymes which retain essentially the same biological
function or activity as such enzymes. Thus, an analog
includes a proprotein which can be activated by cleavage of
the proprotein portion to produce an active mature enzyme.

The enzymes of the present invention may be a 
recombinant enzyme, a natural enzyme or a synthetic enzyme, 
preferably a recombinant enzyme. 

The fragment, derivative or analog of the enzymes of 
Figures 1-14 (SEQ ID NOS: 15-28) may be (i) one in which one 
or more of the amino acid residues are substituted with a 
conserved or non-conserved amino acid residue (preferably a
conserved amino acid residue) and such substituted amino acid
residue may or may not be one encoded by the genetic code, or
(ii) one in which one or more of the amino acid residues
includes a substituent group, or (iii) one in which the
mature enzyme is fused with another compound, such as a
compound to increase the half-life of the enzyme (for
example, polyethylene glycol), or (iv) one in which the 
additional amino acids are fused to the mature enzyme, such 
as a leader or secretory sequence or a sequence which is 
employed for purification of the mature enzyme or a 
proprotein sequence. Such fragments, derivatives and analogs 
are deemed to be within the scope of those skilled in the art 
from the teachings herein. 

The enzymes and polynucleotides of the present invention 
are preferably provided in an isolated form, and preferably 
are purified to homogeneity. 

The term "isolated" means that the material is removed 
from its original environment (e.g., the natural environment
if it is naturally occurring). For example, a naturally-
occurring polynucleotide or enzyme present in a living animal 
is not isolated, but the same polynucleotide or enzyme, 
separated from some or all of the coexisting materials in the 
natural system, is isolated. Such polynucleotides could be 
part of a vector and/or such polynucleotides or enzymes could 
be part of a composition, and still be isolated in that such 
vector or composition is not part of its natural environment.

The enzymes of the present invention include the enzymes
of SEQ ID NOS: 15-28 (in particular the mature enzyme) as well
as enzymes which have at least 70% similarity (preferably at 
least 70% identity) to the enzymes of SEQ ID NOS: 9-16 and 
more preferably at least 90% similarity (more preferably at 
least 90% identity) to the enzymes of SEQ ID NOS: 15-28 and 
still more preferably at least 95% similarity (still more 
preferably at least 95% identity) to the enzymes of SEQ ID 
NOS: 9-16 and also include portions of such enzymes with such 
portion of the enzyme generally containing at least 30 amino 
acids and more preferably at least 50 amino acids. 

As known in the art "similarity" between two enzymes is 
determined by comparing the amino acid sequence and its 
conserved amino acid substitutes of one enzyme to the 
sequence of a second enzyme. 

A variant, i.e. a "fragment", "analog" or "derivative"
polypeptide, and reference polypeptide may differ in amino
acid sequence by one or more substitutions, additions,
deletions, fusions and truncations, which may be present in
any combination.

Among preferred variants are those that vary from a 
reference by conservative amino acid substitutions. Such 
substitutions are those that substitute a given amino acid in 
a polypeptide by another amino acid of like characteristics. 
Typically seen as conservative substitutions are the
replacements, one for another, among the aliphatic amino
acids Ala, Val, Leu and Ile; interchange of the hydroxyl
residues Ser and Thr; exchange of the acidic residues Asp and
Glu; substitution between the amide residues Asn and Gln;
exchange of the basic residues Lys and Arg; and replacements
among the aromatic residues Phe and Tyr.
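
The conservative substitution groups listed above, and the notion of "similarity" as identity plus conserved substitutions, can be sketched as follows. This is an illustration under stated assumptions rather than a prescribed method: the inputs are equal-length, pre-aligned one-letter amino acid sequences, gapped columns count against similarity, and the function names are hypothetical.

```python
# Groups taken from the list above, in one-letter amino acid codes
CONSERVATIVE_GROUPS = [
    {"A", "V", "L", "I"},  # aliphatic: Ala, Val, Leu, Ile
    {"S", "T"},            # hydroxyl: Ser, Thr
    {"D", "E"},            # acidic: Asp, Glu
    {"N", "Q"},            # amide: Asn, Gln
    {"K", "R"},            # basic: Lys, Arg
    {"F", "Y"},            # aromatic: Phe, Tyr
]

def is_conservative(res_a, res_b):
    """True if two different residues belong to the same group above."""
    a, b = res_a.upper(), res_b.upper()
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)

def percent_similarity(aligned_a, aligned_b):
    """Percent of columns that are identical or conservatively substituted."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be non-empty and aligned to equal length")
    similar = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and (x == y or is_conservative(x, y))
    )
    return 100.0 * similar / len(aligned_a)
```

For instance, the aligned peptides ALSDK and AISEQ share two identical positions and two conservative substitutions (Leu/Ile and Asp/Glu), giving 80% similarity but only 40% identity.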

Most highly preferred are variants which retain the same
biological function and activity as the reference polypeptide
from which they vary.

Fragments or portions of the enzymes of the present
invention may be employed for producing the corresponding
full-length enzyme by peptide synthesis; therefore, the
fragments may be employed as intermediates for producing the
full-length enzymes. Fragments or portions of the
polynucleotides of the present invention may be used to
synthesize full-length polynucleotides of the present
invention.

The present invention also relates to vectors which 
include polynucleotides of the present invention, host cells 
which are genetically engineered with vectors of the 
invention and the production of enzymes of the invention by 
recombinant techniques.

Host cells are genetically engineered (transduced or 
transformed or transfected) with the vectors of this 
invention which may be, for example, a cloning vector or an 
expression vector. The vector may be, for example, in the 
form of a plasmid, a viral particle, a phage, etc. The 
engineered host cells can be cultured in conventional 
nutrient media modified as appropriate for activating 
promoters, selecting transformants or amplifying the genes of
the present invention. The culture conditions, such as
temperature, pH and the like, are those previously used with
the host cell selected for expression, and will be apparent
to the ordinarily skilled artisan.

The polynucleotides of the present invention may be 
employed for producing enzymes by recombinant techniques.
Thus, for example, the polynucleotide may be included in any
one of a variety of expression vectors for expressing an
enzyme. Such vectors include chromosomal, nonchromosomal and
synthetic DNA sequences, e.g., derivatives of SV40; bacterial
plasmids; phage DNA; baculovirus; yeast plasmids; vectors
derived from combinations of plasmids and phage DNA; viral
DNA such as vaccinia, adenovirus, fowl pox virus, and
pseudorabies. However, any other vector may be used as long
as it is replicable and viable in the host.

The appropriate DNA sequence may be inserted into the 
vector by a variety of procedures. In general, the DNA 
sequence is inserted into an appropriate restriction 
endonuclease site(s) by procedures known in the art. Such 
procedures and others are deemed to be within the scope of 
those skilled in the art. 

The DNA sequence in the expression vector is operatively 
linked to an appropriate expression control sequence(s)
(promoter) to direct mRNA synthesis. As representative
examples of such promoters, there may be mentioned: LTR or
SV40 promoter, the E. coli lac or trp, the phage lambda
promoter and other promoters known to control expression of 
genes in prokaryotic or eukaryotic cells or their viruses. 
The expression vector also contains a ribosome binding site 
for translation initiation and a transcription terminator. 
The vector may also include appropriate sequences for 
amplifying expression. 

In addition, the expression vectors preferably contain 
one or more selectable marker genes to provide a phenotypic
trait for selection of transformed host cells such as
dihydrofolate reductase or neomycin resistance for eukaryotic
cell culture, or such as tetracycline or ampicillin
resistance in E. coli.

The vector containing the appropriate DNA sequence as 
hereinabove described, as well as an appropriate promoter or
control sequence, may be employed to transform an appropriate
host to permit the host to express the protein.

As representative examples of appropriate hosts, there 
may be mentioned: bacterial cells, such as E. coli,
Streptomyces, Bacillus subtilis; fungal cells, such as yeast;
insect cells such as Drosophila S2 and Spodoptera Sf9; animal
cells such as CHO, COS or Bowes melanoma; adenoviruses; plant
cells, etc. The selection of an appropriate host is deemed
to be within the scope of those skilled in the art from the
teachings herein.

More particularly, the present invention also includes 
recombinant constructs comprising one or more of the 
sequences as broadly described above. The constructs
comprise a vector, such as a plasmid or viral vector, into
which a sequence of the invention has been inserted, in a
forward or reverse orientation. In a preferred aspect of this
embodiment, the construct further comprises regulatory
sequences, including, for example, a promoter, operably
linked to the sequence. Large numbers of suitable vectors
and promoters are known to those of skill in the art, and are
commercially available. The following vectors are provided
by way of example: Bacterial: pQE70, pQE60, pQE-9 (Qiagen),
pD10, psiX174, pBluescript II KS, pNH8A, pNH16a, pNH18A,
pNH46A (Stratagene); ptrc99a, pKK223-3, pKK233-3, pDR540,
pRIT5 (Pharmacia); Eukaryotic: pSV2CAT, pOG44, pXT1, pSG
(Stratagene), pSVK3, pBPV, pMSG, pSVL (Pharmacia). However,
any other plasmid or vector may be used as long as it is
replicable and viable in the host.

Promoter regions can be selected from any desired gene 
using CAT (chloramphenicol transferase) vectors or other
vectors with selectable markers. Two appropriate vectors are
pKK232-8 and pCM7. Particular named bacterial promoters
include lacI, lacZ, T3, T7, gpt, lambda PR, PL and trp.
Eukaryotic promoters include CMV immediate early, HSV
thymidine kinase, early and late SV40, LTRs from retrovirus,
and mouse metallothionein-I. Selection of the appropriate
vector and promoter is well within the level of ordinary 
skill in the art. 

In a further embodiment, the present invention relates 
to host cells containing the above-described constructs. The
host cell can be a higher eukaryotic cell, such as a
mammalian cell, or a lower eukaryotic cell, such as a yeast
cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. Introduction of the construct into the host
cell can be effected by calcium phosphate transfection, DEAE-
Dextran mediated transfection, or electroporation (Davis, L.,
Dibner, M., Battey, I., Basic Methods in Molecular Biology,
(1986)).

The constructs in host cells can be used in a
conventional manner to produce the gene product encoded by
the recombinant sequence. Alternatively, the enzymes of the
invention can be synthetically produced by conventional
peptide synthesizers.

Mature proteins can be expressed in mammalian cells, 
yeast, bacteria, or other cells under the control of 
appropriate promoters. Cell-free translation systems can 
also be employed to produce such proteins using RNAs derived 
from the DNA constructs of the present invention. 
Appropriate cloning and expression vectors for use with 
prokaryotic and eukaryotic hosts are described by Sambrook,
et al., Molecular Cloning: A Laboratory Manual, Second
Edition, Cold Spring Harbor, N.Y., (1989), the disclosure of
which is hereby incorporated by reference. 

Transcription of the DNA encoding the enzymes of the 
present invention by higher eukaryotes is increased by 
inserting an enhancer sequence into the vector. Enhancers 
are cis-acting elements of DNA, usually about from 10 to 300 
bp that act on a promoter to increase its transcription. 
Examples include the SV40 enhancer on the late side of the
replication origin bp 100 to 270, a cytomegalovirus early
promoter enhancer, the polyoma enhancer on the late side of
the replication origin, and adenovirus enhancers. 

Generally, recombinant expression vectors will include 
origins of replication and selectable markers permitting
transformation of the host cell, e.g., the ampicillin
resistance gene of E. coli and S. cerevisiae TRP1 gene, and
a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such
promoters can be derived from operons encoding glycolytic
enzymes such as 3-phosphoglycerate kinase (PGK), α-factor,
acid phosphatase, or heat shock proteins, among others. The
heterologous structural sequence is assembled in appropriate
phase with translation initiation and termination sequences,
and preferably, a leader sequence capable of directing
secretion of translated enzyme. Optionally, the heterologous
sequence can encode a fusion enzyme including an N-terminal
identification peptide imparting desired characteristics,
e.g., stabilization or simplified purification of expressed
recombinant product.

Useful expression vectors for bacterial use are 
constructed by inserting a structural DNA sequence encoding 
a desired protein together with suitable translation 
initiation and termination signals in operable reading phase 
with a functional promoter. The vector will comprise one or 
more phenotypic selectable markers and an origin of 
replication to ensure maintenance of the vector and to, if
desirable, provide amplification within the host. Suitable
prokaryotic hosts for transformation include E. coli,
Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and
Staphylococcus, although others may also be employed as a
matter of choice.

As a representative but nonlimiting example, useful 
expression vectors for bacterial use can comprise a
selectable marker and bacterial origin of replication derived
from commercially available plasmids comprising genetic
elements of the well known cloning vector pBR322 (ATCC
37017). Such commercial vectors include, for example,
pKK223-3 (Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM1
(Promega Biotec, Madison, WI, USA). These pBR322 "backbone"
sections are combined with an appropriate promoter and the
structural sequence to be expressed.

Following transformation of a suitable host strain and
growth of the host strain to an appropriate cell density, the 
selected promoter is induced by appropriate means (e.g., 
temperature shift or chemical induction) and cells are 
cultured for an additional period. 

Cells are typically harvested by centrifugation,
disrupted by physical or chemical means, and the resulting 
crude extract retained for further purification. 

Microbial cells employed in expression of proteins can 
be disrupted by any convenient method, including freeze-thaw
cycling, sonication, mechanical disruption, or use of cell
lysing agents; such methods are well known to those skilled
in the art.

Various mammalian cell culture systems can also be
employed to express recombinant protein. Examples of
mammalian expression systems include the COS-7 lines of
monkey kidney fibroblasts, described by Gluzman, Cell, 23:175
(1981), and other cell lines capable of expressing a
compatible vector, for example, the C127, 3T3, CHO, HeLa and
BHK cell lines. Mammalian expression vectors will comprise 
an origin of replication, a suitable promoter and enhancer, 
and also any necessary ribosome binding sites, 
polyadenylation site, splice donor and acceptor sites, 
transcriptional termination sequences, and 5' flanking 
nontranscribed sequences. DNA sequences derived from the
SV40 splice and polyadenylation sites may be used to provide
the required nontranscribed genetic elements.

The enzyme can be recovered and purified from 
recombinant cell cultures by methods including ammonium 
sulfate or ethanol precipitation, acid extraction, anion or 
cation exchange chromatography, phosphocellulose 
chromatography, hydrophobic interaction chromatography, 
affinity chromatography, hydroxylapatite chromatography and 
lectin chromatography. Protein refolding steps can be used, 
as necessary, in completing configuration of the mature
protein. Finally, high performance liquid chromatography
(HPLC) can be employed for final purification steps.

The enzymes of the present invention may be a naturally 
purified product, or a product of chemical synthetic 
procedures, or produced by recombinant techniques from a 
prokaryotic or eukaryotic host (for example, by bacterial,
yeast, higher plant, insect and mammalian cells in culture).
Depending upon the host employed in a recombinant production
procedure, the enzymes of the present invention may be
glycosylated or may be non-glycosylated. Enzymes of the 
invention may or may not also include an initial methionine 
amino acid residue. 

β-Galactosidase hydrolyzes lactose to galactose and
glucose. Accordingly, the OC1/4V, 9N2-31B/G, AEDII12RA-18B/G
and F1-12G enzymes may be employed in the food processing
industry for the production of low lactose content milk and
for the production of galactose or glucose from lactose
contained in whey obtained in a large amount as a by-product
in the production of cheese. Generally, it is desired that
enzymes used in food processing, such as the aforementioned
β-galactosidases, be stable at elevated temperatures to help
prevent microbial contamination.

These enzymes may also be employed in the pharmaceutical
industry. The enzymes are used to treat intolerance to
lactose. In this case, a thermostable enzyme is desired, as
well. Thermostable β-galactosidases also have uses in
diagnostic applications, where they are employed as reporter
molecules.

Glucosidases act on soluble cellooligosaccharides from 
the non-reducing end to give glucose as the sole product. 
Glucanases (endo- and exo-) act in the depolymerization of 
cellulose, generating more non-reducing ends (endo-
glucanases, for instance, act on internal linkages yielding
cellobiose, glucose and cellooligosaccharides as products).
β-Glucosidases are used in applications where glucose is the
desired product. Accordingly, M11TL, F1-12G, GC74-22G and
MSB8-6G (and OC1/4V, VC1-7G1, 9N2-31B/G and AEDII12RA-18B/G)
may be employed in a wide variety of industrial applications,
including in corn wet milling for the separation of starch
and gluten, in the fruit industry for clarification and 
equipment maintenance, in baking for viscosity reduction, in 
the textile industry for the processing of blue jeans, and in 
the detergent industry as an additive. For these and other 
applications, thermostable enzymes are desirable. 

Antibodies generated against the enzymes corresponding
to a sequence of the present invention can be obtained by
direct injection of the enzymes into an animal or by
administering the enzymes to an animal, preferably a
nonhuman. The antibody so obtained will then bind the
enzymes themselves. In this manner, even a sequence encoding
only a fragment of the enzymes can be used to generate
antibodies binding the whole native enzymes. Such antibodies
can then be used to isolate the enzyme from cells expressing
that enzyme.

For preparation of monoclonal antibodies, any technique
which provides antibodies produced by continuous cell line 
cultures can be used. Examples include the hybridoma 
technique (Kohler and Milstein, 1975, Nature, 256:495-497), 
the trioma technique, the human B-cell hybridoma technique 
(Kozbor et al., 1983, Immunology Today 4:72), and the EBV- 
hybridoma technique to produce human monoclonal antibodies 
(Cole, et al., 1985, in Monoclonal Antibodies and Cancer 
Therapy, Alan R. Liss, Inc., pp. 77-96).

Techniques described for the production of single chain 
antibodies (U.S. Patent 4,946,778) can be adapted to produce 
single chain antibodies to immunogenic enzyme products of 
this invention. Also, transgenic mice may be used to express 
humanized antibodies to immunogenic enzyme products of this 
invention.

Antibodies generated against the enzyme of the present
invention may be used in screening for similar enzymes from
other organisms and samples. Such screening techniques are
known in the art; for example, one such screening assay is
described in "Methods for Measuring Cellulase Activities",
Methods in Enzymology, Vol. 160, pp. 87-116, which is hereby
incorporated by reference in its entirety.

The present invention will be further described with 
reference to the following examples; however, it is to be 
understood that the present invention is not limited to such 
examples. All parts or amounts, unless otherwise specified,
are by weight.

In order to facilitate understanding of the following 
examples certain frequently occurring methods and/or terms 
will be described.

"Plasmids" are designated by a lower case p preceded 
and/or followed by capital letters and/or numbers. The 
starting plasmids herein are either commercially available, 
publicly available on an unrestricted basis, or can be 
constructed from available plasmids in accord with published 
procedures. In addition, equivalent plasmids to those 
described are known in the art and will be apparent to the
ordinarily skilled artisan. 

"Digestion" of DNA refers to catalytic cleavage of the 
DNA with a restriction enzyme that acts only at certain 
sequences in the DNA. The various restriction enzymes used 
herein are commercially available and their reaction 
conditions, cofactors and other requirements were used as
would be known to the ordinarily skilled artisan. For
analytical purposes, typically 1 µg of plasmid or DNA
fragment is used with about 2 units of enzyme in about 20 µl
of buffer solution. For the purpose of isolating DNA
fragments for plasmid construction, typically 5 to 50 µg of
DNA are digested with 20 to 250 units of enzyme in a larger
volume. Appropriate buffers and substrate amounts for
particular restriction enzymes are specified by the
manufacturer. Incubation times of about 1 hour at 37°C are
ordinarily used, but may vary in accordance with the
supplier's instructions. After digestion the reaction is
electrophoresed directly on a polyacrylamide gel to isolate
the desired fragment.
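
Because a restriction enzyme acts only at certain sequences, candidate digestion sites can be located by a simple text search. The sketch below is illustrative only: the function name and dictionary are hypothetical, the recognition sequences are the standard ones for the enzymes named in the examples that follow, and the positions returned are 0-based starts of the recognition sequence rather than the exact cleavage positions.

```python
# Standard recognition sequences for enzymes named in the examples
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BglII": "AGATCT",
    "KpnI": "GGTACC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "SphI": "GCATGC",
}

def find_sites(dna, enzyme):
    """Return 0-based start positions of an enzyme's recognition sequence."""
    site = RECOGNITION_SITES[enzyme]
    seq = dna.upper()
    positions = []
    index = seq.find(site)
    while index != -1:
        positions.append(index)
        # Continue one base further on so overlapping sites are also found
        index = seq.find(site, index + 1)
    return positions
```

A search of this kind can confirm, for example, that a primer carries the intended cloning site and that the site does not also occur inside the insert.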

Size separation of the cleaved fragments is performed 
using 8 percent polyacrylamide gel described by Goeddel, D. 
et al., Nucleic Acids Res., 8:4057 (1980).

"Oligonucleotides" refers to either a single stranded 
polydeoxynucleotide or two complementary polydeoxynucleotide 
strands which may be chemically synthesized. Such synthetic 
oligonucleotides have no 5' phosphate and thus will not 
ligate to another oligonucleotide without adding a phosphate 
with an ATP in the presence of a kinase. A synthetic 
oligonucleotide will ligate to a fragment that has not been 
dephosphorylated.

"Ligation" refers to the process of forming 
phosphodiester bonds between two double stranded nucleic acid
fragments (Maniatis, T., et al., Id., p. 146). Unless
otherwise provided, ligation may be accomplished using known
buffers and conditions with 10 units of T4 DNA ligase
("ligase") per 0.5 µg of approximately equimolar amounts of
the DNA fragments to be ligated.

Unless otherwise stated, transformation was performed as 
described in the method of Graham, F. and Van der Eb, A.,
Virology, 52:456-457 (1973). 

Example 1 

Bacterial Expression and Purification of Glycosidase Enzymes

DNA encoding the enzymes of the present invention, SEQ
ID NOS: 1 through 8, were initially amplified from a
pBluescript vector containing the DNA by the PCR technique
using the primers noted herein. The amplified sequences were
then inserted into the respective pQE vector listed beneath
the primer sequences, and the enzyme was expressed according
to the protocols set forth herein. The 5' and 3' primer
sequences for the respective genes are as follows:

Thermococcus AEDII12RA-18B/G

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGGTGAATGCTATGATTGTC
(SEQ ID NO: 29)

3' CGGAAGATCTTCATAGCTCCGGAAGCCCATA (SEQ ID NO: 30)

Vector: pQE12; and contains the following restriction enzyme
sites 5' EcoRI and 3' Bgl II.

OC1/4V-33B/G 

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGATAAGAAGGTCCGATTTTCC
(SEQ ID NO: 31)

3' CGGAAGATCTTTAAGATTTTAGAAATTCCTT (SEQ ID NO: 32)

Vector: pQE12; and contains the following restriction enzyme
sites 5' EcoRI and 3' Bgl II.

Thermococcus 9N2-31B/G

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGCTACCAGAAGGCTTTCTC
(SEQ ID NO: 33)

3' CGGAGGTACCTCACCCAAGTCCGAACTTCTC (SEQ ID NO: 34)

Vector: pQE30; and contains the following restriction enzyme
sites 5' EcoRI and 3' KpnI.

Staphylothermus marinus F1-12G

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGATAAGGTTTCCTGiATTAT
(SEQ ID NO: 35)

3' CGGAAGATCTTTATTCGAGGTTCTTTAATCC (SEQ ID NO: 36)

Vector: pQE12; and contains the following restriction enzyme
sites 5' EcoRI and 3' Bgl II.

Thermococcus chitonophagus GC74 - 22G 

5 ' cCQAGAATTCATTaVTTAAAGAGGAGAAATTAACTATGCTTCCAGGAGAACTTTCTC 
(SEQ ID NO: 37) 

3' CGGAGGATCCCTACCCCTCCTCTAAGATCTC (SEQ ID NO: 38) 

Vector: pQE12; and contains the following restriction enzyme 
sites 5' EcoRI and 3' BamHI. 

M11TL 

5' AATAATCTAGAGCATGCAATTCCCCAAAGACrrTCATGATAG (SEQ ID NO: 39) 

3' AATAAAAGCTTACTGGATCAGTGTAAGATGCT (SEQ ID NO: 40) 

Vector: pQE70; and contains the following restriction enzyme 
sites 5' SphI and 3' Hind III. 

Thermotoga maritima MSB8-6G 

5 ' CCGACAATTGATTAAAGAGGAGAAATTAACTATGGAAAGGATCGATGAAATT 
(SEQ ID NO: 41) 

3' CGGAGGTACCTCyVTGGTTTGAATCTCTTCTC (SEQ ID NO: 42) 

Vector: pQE12; and contains the following restriction enzyme 

sites 5' EcoRI and 3' KpnI. 

Pyrococcus furiosus VC1 - 7G1 

5 ' CCGACAATTGATTAAAGAGGAGAAATTAACTATGTTCCCTGAAAAGTTCCTT 
(SEQ ID NO: 43) 

3' CGGAGGTACCTCATCCCCTCAGCAATTCCTC (SEQ ID NO: 44) 

Vector: pQE12; and contains the following restriction enzyme 

sites 5' EcoRI and 3' Kpn I. 

Bankia gouldi endoglucanase (37GP1) 

5' AATAAGGATCCGTTTAGCGACGCTCGC 
(SEQ ID NO:45) 

3' AATAAAAGCTTCCGGGTTGTACAGCGGTAATAGGC (SEQ ID NO: 46) 

Vector: pQE52; and contains the following restriction enzyme 
sites 5' Bam HI and 3' Hind III. 

Thermotoga maritima α-galactosidase (6GC2) 

5 ' TTTATTGAATTCATTAAAGAGGAGAWVTTAACTATGATCTGTGT^ 
(SEQ ID NO: 47) 

3' TCTATAAAGCTTTCATTCTCrCTCACCCTXrrTCGTAGAAG (SEQ ID NO: 48) 

Vector: pQEt; and contains the following restriction enzyme 
sites 5' EcoRI and 3' Hind III. 

Thermotoga maritima β-mannanase (6GP2) 

5' TTTATTCAATTGATrAAAGAGGAGAAATTAACTATGGGGATTGGTGGCCa^C^ 
(SEQ ID NO: 49) 

3' TTTATTAAGCTTATCTTTTCATATTCACATACCTCC (SEQ ID NO: 50) 

Vector: pQEt; and contains the following restriction enzyme 
sites 5' Hind III and 3' EcoRI. 

AEPII 1a β-mannosidase (63GB1) 

5 ' TTTATTGAATTCATTAAAGAGGACaJUlTTAACTATGCTACCAGAAGACnTCCTATGGGGC 
(SEQ ID NO:51) 

3' TTTATTAAGCTTCTCATCAACGGCTATGGTCTTCATTTC (SEQ ID NO: 52) 

Vector: pQEt; and contains the following restriction enzyme 
sites 5' Hind III and 3' EcoRI. 

OC1/4V endoglucanase (33GP1) 

5' AAAAAAOJOTGAATTCATTAAAGAGGACaA^^ 
(SEQ ID NO: 53) 

3' TTTTTCWaTCCRATTCTTCATTTACICTTTGCCTG (SEQ ID NO: 54) 

Vector: pQEt; and contains the following restriction enzyme 
sites 5' BamHI and 3' EcoRI. 

Thermotoga maritima pullulanase (6GP3) 

5' TITTGGAATTC3lTTAAAGA«3AGAAATTAACTATGGAACTGATCATAGAAGGrre 
(SEQ ID NO: 55) 

3' ATAACSAAGCTTlTCACTCTCrGTACAGAACGTACGC (SEQ ID NO: 56) 

Vector: pQEt; and contains the following restriction enzyme 
sites 5' EcoRI and 3' Hind III. 

The restriction enzyme sites indicated correspond to the 
restriction enzyme sites on the bacterial expression vector 
indicated for the respective gene (Qiagen, Inc., Chatsworth, 
CA). The pQE vector encodes antibiotic resistance (Amp^r), a 
bacterial origin of replication (ori), an IPTG-regulatable 
promoter operator (P/O), a ribosome binding site (RBS), a 
6-His tag and restriction enzyme sites. 
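
Because each primer carries a restriction site matching the cloning site of its vector, a quick check is to scan the primer for the expected recognition sequence. The Python sketch below does this for the OC1/4V-33B/G primers as printed above. The recognition sequences are the standard ones for the named enzymes; the function name `find_sites` is hypothetical and used only for illustration.

```python
# Standard recognition sequences for the enzymes named in the primer list.
SITES = {
    "EcoRI": "GAATTC",
    "BglII": "AGATCT",
    "KpnI": "GGTACC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "SphI": "GCATGC",
}


def find_sites(primer: str) -> dict:
    """Return {enzyme: 0-based position} for each recognition site in the primer."""
    seq = primer.upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


# Primers for OC1/4V-33B/G (SEQ ID NO: 31 and SEQ ID NO: 32) as listed above.
forward = "CCGAGAATTCATTAAAGAGGAGAAATTAACTATGATAAGAAGGTCCGATTTTCC"
reverse = "CGGAAGATCTTTAAGATTTTAGAAATTCCTT"

print("5' primer:", find_sites(forward))
print("3' primer:", find_sites(reverse))
```

For this pair the scan reports an EcoRI site in the 5' primer and a Bgl II site in the 3' primer, in agreement with the sites listed for pQE12.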

The pQE vector was digested with the restriction enzymes 
indicated. The amplified sequences were ligated into the 
respective pQE vector and inserted in frame with the sequence 
encoding for the RBS. The ligation mixture was then used to 
transform the E. coli strain M15/pREP4 (Qiagen, Inc.) by 
electroporation. M15/pREP4 contains multiple copies of the 
plasmid pREP4, which expresses the lacI repressor and also 
confers kanamycin resistance (Kan^r). Transformants were 
identified by their ability to grow on LB plates and 
ampicillin/kanamycin resistant colonies were selected. 
Plasmid DNA was isolated and confirmed by restriction 
analysis. Clones containing the desired constructs were 
grown overnight (O/N) in liquid culture in LB media 
supplemented with both Amp (100 µg/ml) and Kan (25 µg/ml). 
The O/N culture was used to inoculate a large culture at a 
ratio of 1:100 to 1:250. The cells were grown to an optical 
density 600 (O.D.600) of between 0.4 and 0.6. IPTG 
("Isopropyl-β-D-thiogalactopyranoside") was then added to a 
final concentration of 1 mM. IPTG induces by inactivating 
the lacI repressor, clearing the P/O leading to increased 
gene expression. Cells were grown an extra 3 to 4 hours. 
Cells were then harvested by centrifugation. 
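
The inoculation ratio (1:100 to 1:250) and the antibiotic concentrations (100 µg/ml Amp, 25 µg/ml Kan) translate into working volumes once a culture size is chosen. The Python sketch below illustrates the arithmetic. The 1 litre culture volume and the stock concentrations (100 mg/ml ampicillin, 50 mg/ml kanamycin) are assumptions made for the example and are not stated in the text.

```python
def inoculum_ml(culture_ml: float, dilution: int) -> float:
    """Volume of overnight culture needed for a 1:dilution inoculation."""
    return culture_ml / dilution


def stock_volume_ul(final_ug_per_ml: float, culture_ml: float,
                    stock_mg_per_ml: float) -> float:
    """Stock volume (µl) giving the final concentration; µg / (mg/ml) = µl."""
    return final_ug_per_ml * culture_ml / stock_mg_per_ml


CULTURE_ML = 1000.0        # assumed culture size, not from the text
AMP_STOCK_MG_PER_ML = 100  # assumed stock concentration
KAN_STOCK_MG_PER_ML = 50   # assumed stock concentration

low = inoculum_ml(CULTURE_ML, 250)   # least overnight culture (1:250)
high = inoculum_ml(CULTURE_ML, 100)  # most overnight culture (1:100)
amp_ul = stock_volume_ul(100, CULTURE_ML, AMP_STOCK_MG_PER_ML)
kan_ul = stock_volume_ul(25, CULTURE_ML, KAN_STOCK_MG_PER_ML)

print(f"Inoculum: {low:.0f} to {high:.0f} ml of O/N culture")
print(f"Amp stock: {amp_ul:.0f} µl, Kan stock: {kan_ul:.0f} µl")
```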

The primer sequences set out above may also be employed 
to isolate the target gene from the deposited material by 
hybridization techniques described above. 

Example 2 



Isolation of a Selected Clone from the Deposited Genomic Clones 

A clone is isolated directly by screening the 
deposited material using the oligonucleotide primers set 
forth in Example 1 for the particular gene desired to be 
isolated. The specific oligonucleotides are synthesized 
using an Applied Biosystems DNA synthesizer. The 
oligonucleotides are labeled with 32P-γ-ATP using T4 
polynucleotide kinase and purified according to a standard 
protocol (Maniatis et al., Molecular Cloning: A Laboratory 
Manual, Cold Spring Harbor Press, Cold Spring, NY, 1982). 
The deposited clones in the pBluescript vectors may be 
employed to transform bacterial hosts which are then plated 
on 1.5% agar plates to the density of 20,000-50,000 
pfu/150 mm plate. These plates are screened using nylon 
membranes according to the standard screening protocol 
(Stratagene, 1993). Specifically, the nylon membrane with 
denatured and fixed DNA is prehybridized in 6 x SSC, 20 mM 
NaH2PO4, 0.4% SDS, 5 x Denhardt's, 500 µg/ml denatured, 
sonicated salmon sperm DNA; and 6 x SSC, 0.1% SDS. After 
one hour of prehybridization, the membrane is hybridized 
with hybridization buffer 6 x SSC, 20 mM NaH2PO4, 0.4% SDS, 
500 µg/ml denatured, sonicated salmon sperm DNA with 1 x 10^6 
cpm/ml 32P-probe overnight at 42°C. The membrane is washed 
at 45-50°C with washing buffer 6 x SSC, 0.1% SDS for 20-30 
minutes, dried and exposed to Kodak X-ray film overnight. 
Positive clones are isolated and purified by secondary and 
tertiary screening. The purified clone is sequenced to 
verify its identity to the primer sequence. 

Once the clone is isolated, the two oligonucleotide 
primers corresponding to the gene of interest are used to 
amplify the gene from the deposited material. A polymerase 
chain reaction is carried out in 25 µl of reaction mixture 
with 0.5 µg of the DNA of the gene of interest. The 
reaction mixture is 1.5-5 mM MgCl2, 0.01% (w/v) gelatin, 20 
µM each of dATP, dCTP, dGTP, dTTP, 25 pmol of each primer 
and 0.25 Unit of Taq polymerase. Thirty five cycles of PCR 
(denaturation at 94°C for 1 min; annealing at 55°C for 1 
min; elongation at 72°C for 1 min) are performed with the 
Perkin-Elmer Cetus automated thermal cycler. The amplified 
product is analyzed by agarose gel electrophoresis and the 
DNA band with expected molecular weight is excised and 
purified. The PCR product is verified to be the gene of 
interest by subcloning and sequencing the DNA product. The 
ends of the newly purified genes are nucleotide sequenced 
to identify full length sequences. Complete sequencing of 
full length genes is then performed by Exonuclease III 
digestion or primer walking. 
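
Two derived quantities follow directly from the PCR conditions given above: the primer concentration in the reaction and the time spent in the programmed cycling steps. The short Python sketch below computes both. It ignores ramp times between temperatures, which the text does not specify, and the function names are hypothetical.

```python
# Values taken from the PCR description above.
REACTION_VOLUME_UL = 25.0
PRIMER_PMOL = 25.0
CYCLES = 35
STEP_MINUTES = {"denaturation": 1.0, "annealing": 1.0, "elongation": 1.0}


def primer_concentration_um(pmol: float, volume_ul: float) -> float:
    """pmol per µl is numerically equal to µmol per litre (µM)."""
    return pmol / volume_ul


def total_cycling_minutes(cycles: int, step_minutes: dict) -> float:
    """Time spent in the programmed steps only (ramp times not included)."""
    return cycles * sum(step_minutes.values())


primer_um = primer_concentration_um(PRIMER_PMOL, REACTION_VOLUME_UL)
run_minutes = total_cycling_minutes(CYCLES, STEP_MINUTES)

print(f"Each primer: {primer_um:.1f} µM")
print(f"Programmed cycling time: {run_minutes:.0f} min")
```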

Example 3 

Screening for Galactosidase Activity 

Screening procedures for α-galactosidase protein 
activity may be assayed for as follows: 

Substrate plates were provided by a standard plating 
procedure. Dilute XL1-Blue MRF' E. coli host (Stratagene 
Cloning Systems, La Jolla, CA) to O.D.600 = 1.0 with NZY 
media. In 15 ml tubes, inoculate 200 µl diluted host cells 
with phage. Mix gently and incubate tubes at 37°C for 15 
min. Add approximately 3.5 ml LB top agarose (0.7%) 
containing 1 mM IPTG to each tube and pour onto an NZY 
plate surface. Allow to cool and incubate at 37°C 
overnight. The assay plates are prepared with the substrate 
p-nitrophenyl α-galactoside (Sigma) (200 mg/100 ml) in 100 mM 
NaCl, 100 mM potassium phosphate and 1% (w/v) agarose. The 
plaques are overlayed with nitrocellulose and incubated at 
4°C for 30 minutes, whereupon the nitrocellulose is removed 
and overlayed onto the substrate plates. The substrate 
plates are then incubated at 70°C for 20 minutes. 
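
The substrate loading of 200 mg per 100 ml can be expressed as a molar concentration. The Python sketch below does the conversion. It assumes that the substrate is p-nitrophenyl α-D-galactopyranoside with a molar mass of about 301.25 g/mol; that value is not given in the text and is supplied here only for illustration.

```python
SUBSTRATE_MG = 200.0   # from the text
VOLUME_ML = 100.0      # from the text
MOLAR_MASS_G_PER_MOL = 301.25  # assumed: p-nitrophenyl alpha-D-galactopyranoside


def millimolar(mass_mg: float, volume_ml: float, molar_mass: float) -> float:
    """mg/ml equals g/l; dividing by g/mol gives mol/l, times 1000 gives mM."""
    grams_per_litre = mass_mg / volume_ml
    return grams_per_litre / molar_mass * 1000.0


substrate_mm = millimolar(SUBSTRATE_MG, VOLUME_ML, MOLAR_MASS_G_PER_MOL)
print(f"Substrate concentration: about {substrate_mm:.2f} mM")
```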

Example 4 

Screening of Clones for Mannanase Activity 

A solid phase screening assay was utilized as a 
primary screening method to test clones for β-mannanase 
activity. 

A culture solution of the Y1090- E. coli host strain 
(Stratagene Cloning Systems, La Jolla, CA) was diluted to 
O.D.600 = 1.0 with NZY media. The amplified library from 
Thermotoga maritima lambda gt11 library was diluted in SM 
(phage dilution buffer): 5 x 10^ pfu/µl diluted 1:1000 
then 1:100 to 5 x 10^ pfu/µl. Then 8 µl of phage dilution 
(5 x 10^ pfu/µl) was plated in 200 µl host cells. They 
were then incubated in 15 ml tubes at 37°C for 15 minutes. 
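
The two-step dilution described above (1:1000 followed by 1:100) reduces the phage titer by a factor of 100,000. The Python sketch below works through the arithmetic. The exponents of the titers are not legible in this copy of the text, so the starting titer used here is purely an assumed value for illustration.

```python
def serial_dilution(titer_pfu_per_ul: float, folds: list) -> float:
    """Apply successive dilution factors to a starting titer (pfu/µl)."""
    for fold in folds:
        titer_pfu_per_ul /= fold
    return titer_pfu_per_ul


# ASSUMPTION: hypothetical starting titer; the real value is not recoverable here.
ASSUMED_STOCK_TITER = 5e7  # pfu/µl
PLATED_VOLUME_UL = 8       # from the text

diluted_titer = serial_dilution(ASSUMED_STOCK_TITER, [1000, 100])
plated_pfu = diluted_titer * PLATED_VOLUME_UL

print(f"Diluted titer: {diluted_titer:.0f} pfu/µl")
print(f"Phage plated per tube: {plated_pfu:.0f} pfu")
```

Whatever the true starting titer, the diluted titer is the starting value divided by 100,000, and the number of phage plated is eight times the diluted titer.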

Approximately 4 ml of molten, LB top agarose (0.7%) at 
approximately 52°C was added to each tube and the mixture 
was poured onto the surface of LB agar plates. The agar 
plates were then incubated at 37°C for five hours. The 
plates were replicated and induced with 10 mM IPTG-soaked 
Duralon-UV™ nylon membranes (Stratagene Cloning Systems, 
La Jolla, CA) overnight. The nylon membranes and plates 
were marked with a needle to keep their orientation and the 
nylon membranes were then removed and stored at 4°C. 

An Azo-galactomannan overlay was applied to the LB 
plates containing the lambda plaques. The overlay contains 
1% agarose, 50 mM potassium-phosphate buffer pH 7, 0.4% 
Azocarob-galactomannan (Megazyme, Australia). The plates 
were incubated at 72°C. The Azocarob-galactomannan 
treated plates were observed after 4 hours then returned to 
incubation overnight. Putative positives were identified 
by clearing zones on the Azocarob-galactomannan plates. 
Two positive clones were observed. 

The nylon membranes referred to above, which 
correspond to the positive clones, were retrieved, oriented 
over the plate and the portions matching the locations of 
the clearing zones for positive clones were cut out. Phage 
was eluted from the membrane cut-out portions by soaking 
the individual portions in 500 µl SM (phage dilution 
buffer) and 25 µl CHCl3. 

Example 5 

Screening of Clones for Mannosidase Activity 

A solid phase screening assay was utilized as a 
primary screening method to test clones for β-mannosidase 
activity. 

A culture solution of the Y1090- E. coli host strain 
(Stratagene Cloning Systems, La Jolla, CA) was diluted to 
O.D.600 = 1.0 with NZY media. The amplified library from 
AEPII 1a lambda gt11 library was diluted in SM (phage 
dilution buffer): 5 x 10^ pfu/µl diluted 1:1000 then 1:100 
to 5 x 10^ pfu/µl. Then 8 µl of phage dilution 
(5 x 10^ pfu/µl) was plated in 200 µl host cells. They 
were then incubated in 15 ml tubes at 37°C for 15 minutes. 

Approximately 4 ml of molten, LB top agarose (0.7%) at 
approximately 52°C was added to each tube and the mixture 
was poured onto the surface of LB agar plates. The agar 
plates were then incubated at 37°C for five hours. The 
plates were replicated and induced with 10 mM IPTG-soaked 
Duralon-UV™ nylon membranes (Stratagene Cloning Systems, 
La Jolla, CA) overnight. The nylon membranes and plates 
were marked with a needle to keep their orientation and the 
nylon membranes were then removed and stored at 4°C. 

A p-nitrophenyl-β-D-manno-pyranoside overlay was 
applied to the LB plates containing the lambda plaques. 
The overlay contains 1% agarose, 50 mM potassium-phosphate 
buffer pH 7, 0.4% p-nitrophenyl-β-D-manno-pyranoside 
(Megazyme, Australia). The plates were incubated at 72°C. 
The p-nitrophenyl-β-D-manno-pyranoside treated plates were 
observed after 4 hours then returned to incubation 
overnight. Putative positives were identified by clearing 
zones on the p-nitrophenyl-β-D-manno-pyranoside plates. 
Two positive clones were observed. 

The nylon membranes referred to above, which 
correspond to the positive clones, were retrieved, oriented 
over the plate and the portions matching the locations of 
the clearing zones for positive clones were cut out. Phage 
was eluted from the membrane cut-out portions by soaking 
the individual portions in 500 µl SM (phage dilution 
buffer) and 25 µl CHCl3. 

Example 6 

Screening for Pullulanase Activity 

Screening procedures for pullulanase protein activity 
may be assayed for as follows: 

Substrate plates were provided by a standard plating 
procedure. Host cells are diluted to O.D.600 = 1.0 with NZY 
or appropriate media. In 15 ml tubes, inoculate 200 µl 
diluted host cells with phage. Mix gently and incubate 
tubes at 37°C for 15 min. Approximately 3.5 ml LB top 
agarose (0.7%) is added to each tube and the mixture is 
plated, allowed to cool, and incubated at 37°C for about 28 
hours. Overlays of 4.5 ml of the following substrate are 
poured: 

100 ml total volume 

0.5 g    Red Pullulan (Megazyme, Australia) 
1.0 g    Agarose 
5 ml     Buffer (Tris-HCl pH 7.2 @ 75°C) 
2 ml     5 M NaCl 
5 ml     CaCl2 (100 mM) 
85 ml    dH2O 

Plates are cooled at room temperature, and then incubated 
at 75°C for 2 hours. Positives are observed as showing 
substrate degradation. 
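
The final concentrations in the overlay follow from the recipe by simple dilution (stock concentration times stock volume divided by total volume). The Python sketch below computes them for the components whose stock concentrations are stated. The concentration of the Tris-HCl buffer stock is not given in the text, so it is left out; the function names are hypothetical.

```python
TOTAL_VOLUME_ML = 100.0  # from the recipe


def final_concentration(stock: float, volume_ml: float,
                        total_ml: float = TOTAL_VOLUME_ML) -> float:
    """C1 * V1 / V2; result is in the same unit as the stock concentration."""
    return stock * volume_ml / total_ml


def percent_w_v(grams: float, total_ml: float = TOTAL_VOLUME_ML) -> float:
    """Grams per 100 ml expressed as % (w/v)."""
    return grams / total_ml * 100.0


nacl_mm = final_concentration(5000.0, 2.0)   # 2 ml of 5 M NaCl
cacl2_mm = final_concentration(100.0, 5.0)   # 5 ml of 100 mM CaCl2
pullulan_pct = percent_w_v(0.5)              # 0.5 g Red Pullulan
agarose_pct = percent_w_v(1.0)               # 1.0 g agarose

print(f"NaCl: {nacl_mm:.0f} mM, CaCl2: {cacl2_mm:.0f} mM")
print(f"Red Pullulan: {pullulan_pct:.1f}% (w/v), agarose: {agarose_pct:.1f}% (w/v)")
```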

Example 7 

Screening for Endoglucanase Activity 

Screening procedures for endoglucanase protein 
activity may be assayed for as follows: 

1. The gene library is plated onto 6 LB/GelRite/0.1% 
CMC/NZY agar plates (~4,800 plaque forming units/plate) in 
E. coli host with LB agarose as top agarose. The plates are 
incubated at 37°C overnight. 

2. Plates are chilled at 4°C for one hour. 

3. The plates are overlayed with Duralon membranes 
(Stratagene) at room temperature for one hour and the 
membranes are oriented and lifted off the plates and stored 
at 4°C. 

4. The top agarose layer is removed and plates are 
incubated at 37°C for ~3 hours. 

5. The plate surface is rinsed with NaCl. 

6. The plate is stained with 0.1% Congo Red for 15 
minutes. 

7. The plate is destained with 1M NaCl. 

8. The putative positives identified on plate are 
isolated from the Duralon membrane (positives are 
identified by clearing zones around clones). The phage is 
eluted from the membrane by incubating in 500 µl SM and 
25 µl CHCl3 to elute. 

9. Insert DNA is subcloned into any appropriate 
cloning vector and subclones are reassayed for CMCase 
activity using the following protocol: 

i) Spin 1 ml overnight miniprep of clone at 
maximum speed for 3 minutes. 

ii) Decant the supernatant and use it to fill 
"wells" that have been made in an LB/GelRite/0.1% CMC 
plate. 

iii) Incubate at 37°C for 2 hours. 

iv) Stain with 0.1% Congo Red for 15 minutes. 

v) Destain with 1M NaCl for 15 minutes. 

vi) Identify positives by clearing zone around 
clone. 

Numerous modifications and variations of the present 
invention are possible in light of the above teachings and, 
therefore, within the scope of the appended claims, the 
invention may be practiced otherwise than as particularly 
described. 

WHAT IS CLAIMED IS: 

1. An isolated polynucleotide comprising a member 
selected from the group consisting of: 

(a) a polynucleotide having at least a 70% 
identity to a polynucleotide encoding an enzyme comprising 
amino acid sequences set forth in SEQ ID NOS: 15-28; 

(b) a polynucleotide which is complementary to 
the polynucleotide of (a); and 

(c) a polynucleotide comprising at least 15 
bases of the polynucleotide of (a) or (b). 

2. The polynucleotide of Claim 1 wherein the 
polynucleotide is DNA. 

3. The polynucleotide of Claim 1 wherein the 
polynucleotide is RNA. 

4. The polynucleotide of Claim 2 which encodes an 
enzyme comprising an amino acid sequence which is a member 
selected from the group: 

(a) according to SEQ ID NO: 15; 
(b) according to SEQ ID NO: 16; 
(c) according to SEQ ID NO: 17; 
(d) according to SEQ ID NO: 18; 
(e) according to SEQ ID NO: 19; 
(f) according to SEQ ID NO: 20; 
(g) according to SEQ ID NO: 21; 
(h) according to SEQ ID NO: 22; 
(i) according to SEQ ID NO: 23; 
(j) according to SEQ ID NO: 24; 
(k) according to SEQ ID NO: 25; 
(l) according to SEQ ID NO: 26; 
(m) according to SEQ ID NO: 27; 
(n) according to SEQ ID NO: 28. 

5. An isolated polynucleotide comprising a member 
selected from the group consisting of: 

(a) a polynucleotide having at least a 70% 
identity to a polynucleotide encoding an enzyme encoded by 
the DNA contained in ATCC Deposit No. 97379, wherein said 
enzyme is selected from the group consisting of M11TL, 
OC1/4V, F1-12G, 9N2-31B/G, MSB8-6G, AEDII12RA-18B/G, 
GC74-22G and VC1-7G1; 

(b) a polynucleotide complementary to the 
polynucleotide of (a); and 

(c) a polynucleotide comprising at least 15 
bases of the polynucleotide of (a) and (b). 

6. A vector comprising the DNA of Claim 2. 

7. A host cell comprising the vector of Claim 6. 

8. A process for producing a polypeptide comprising: 
expressing from the host cell of Claim 7 a polypeptide 
encoded by said DNA. 

9. A process for producing a cell comprising: 
transforming or transfecting the cell with the vector of 
Claim 6 such that the cell expresses the polypeptide 
encoded by the DNA contained in the vector. 

10. An enzyme comprising a member selected from the 
group consisting of: 

(a) an enzyme comprising an amino acid sequence 
which is at least 70% identical to the amino acid sequence 
set forth in SEQ ID NOS: 15-28; and 

(b) an enzyme which comprises at least 30 amino 
acid residues to the enzyme of (a). 

11. A method for generating glucose from soluble 
cellooligosaccharides comprising: 

administering an effective amount of an enzyme 
selected from the group consisting of an enzyme having the 
amino acid sequence set forth in SEQ ID NOS: 15-28. 

M11TL GLYCOSIDASE - 29G 
COMPLETE GENE SEQUENCE - 9/95 

rn: aaa rr« .ic aaa i:ai- -rrt ai«. ata •;*.» ta. rt a r'-r rt a i*i*t. rrr iaa rn i.aa . *: ..i 

M»M l.v:. rl».. I't.. I.ys A^:^• I'Im- Mto it<- -.Iv Ivf :»i ...| *;..| |'f> IMm- Kit) A).. '»» 

I.I i;<rr ati* c'-*- kcj: -nr- tiAc iiat aai- akt k.at vtu: n:t; i;ta Ti:t; trn; cat uat »;a<. i.mi 

.■ : i:Jv J U. »»rf. i:iv .'.^-r Clu A;:r^ t'f Asii S.I Asp Tip Trp Vh I Trp V*» I Hi:; Ani» t ;iit ^t> 

K'l AA»* Ai.A iU'A iifT r;f;A r-TA *rrc Ai;i* i:i;c i:at -rrr i rr cai; aac t;t;(.* lVA larr tai- n:t: aai iho 

U A.'.fi Thi Ai.» Alri <i\y Vnl Si-i iMy A.-p ffi.* »*i o C. I u Asn <;Jy o c;lv t t i |. Arin i.O 

JRJ TTA AAC CAA AAT GAC CAC CAl* CTC: CtCT GAG AA{; CTC CCC CTT AAC ACT ATT AUA iriA *;t;( J AO 

<»j i.i'u Asn CIn Asn Asp His Asp i.imi Alft i;iu Lys Leu Cly Val Asn Tin He Arq Vul UJy 80 

2 4 I CTT GAC TOG ACT ACG ATT TTT CCA AAG CCA ACT TTC AAT CTT AAA CTC CCT CTA CAC; ACA 300 

fll val Clu Trp Ser Arg lie Phe Pro tys I'l o Thr Phe Asn val Lys Vai Pro Val Clu Arg 100 

301 CAT GAC AAC CCC ACC ATT CTT CAC CTA CAT CTC CAT CAT AAA CCC CTT CAA ACA CTT CAT 360 

101 Asp Clu Asn Cly Ser He Val His Val Asp Val Asp Asp Lys Ala Val Clu Arg Leu Asp 120 

361 GAA TTA CCC AAC AAG CAC CCC CTA AAC CAT TAC CTA GAA ATG TAT AAA GAC TCC CTT CAA 4 20 

121 Giu Leu Ala Asn Lys Clu Ala Val Asn His Tyr Val Clu Met Tyr Lys Asp Trp Val Clu 140 

421 ACA CCT ACA AAA CTT ATA CTC AAT TTA TAC CAT TCC CCC CTC CCT CTC TCC CTT CAC AAC 4 80 

14 1 Arg Cly Arg Lys Leu He Leu Asn Leu Tyr His Trp Pro Leu Pro Leu Trp Leu His Asn 160 

461 CCA ATC ATG CTC ACA ACA ATC CCC CCC 1 1 AC AGA CCC CCC TCA CCC TCC CTT AAC GAG GAG b4 0 

161 Pro He Met Val Arg Arg Met Cly Pro Asp Arg Ala Pro Ser Gly Trp Leu Asn Glu Clu 180 

541 TCC CTC CTC GAG TTT GCC AAA TAC CCC CCA TAC ATT CCT TCC AAA ATC GGC CAC CTA CCT 600 

181 Ser Val Val Clu Phe Ala Lys Tyr Ala Ala Tyr He Ala Trp Lys Met Gly Clu Leu Pro 20O 

601 CTT ATG TGG ACC ACC ATC AAC CAA CCC AAC CTC CTT TAT GAC CAA CCA TAC ATC TTC CTT 660 

201 Val Met Trp Ser Thr Met Asn Clu Pro Asn Val Val Tyr Clu Gin Gly Tyr Met Phe Val 220 

661 AAA GGC CCT TTC CCA CCC CCC TAC TTG ACT TTC GAA CCT CCT GAT AAC CCC AGC AGA AAT 720 

221 Lys Cly Cly Phe Pro Pro Cly Tyr Leu Ser Leu Glu Ala Ala Asp Lys Ala Arg Arg Asn 240 

721 ATG ATC CAC CCT CAT CCA CCC CCC TAT CAC AAT ATT AAA CCC TTC ACT AAC AAA CCT CTT 780 

241 Met He Gin Ala His Ala Arg Ala Tyr Asp Asn He Lys Arg Phe Ser Lys Lys Pro Val 2 60 

781 GCA CTA ATA TAC CCT TTC CAA TCC TTC CAA CTA TTA GAC GGT CCA CCA CAA CTA TTT CAT 840 

261 Cly Leu He Tyr Ala Phe Gin Trp Phe Clu Leu Leu Glu Gly Pro Ala Clu Val Phe Asp 280 

841 AAC TTT AAG ACC TCT AAC TTA TAC TAT TTC ACA CAC ATA CTA TCC AAC CCT ACT TCA ATC 900 

281 Lys Phe Lys Ser Ser Lys Leu Tyr Tyr Phe Thr Asp He Val Ser Lys Gly Ser Ser He 300 

901 ATC AAT CTT CAA TAC AGC ACA CAT CTT CCC AAT AGG CTA CAC TCC TTG CCC CTT AAC TAC 9 60 

301 He Asn Val Clu Tyr Arg Arg Asp Leu Ala Asn Arg Leu Asp Trp Leu Gly Val Asn Tyr 320 

961 TAT AGC CCT TTA CTC TAC AAA ATC CTC CAT CAC AAA CCT ATA ATC CTC CAC CCC TAT CCA 1020 

321 Tyr Ser Arg Leu Val Tyr Lys He Val Asp Asp Lys Pro He He Leu His Cly Tyr Cly 340 

1021 TTC CTT TCT ACA CCT CCC GCC ATC ACC CCC CCT CAA AAT CCT TCT AGC CAT TTT CCC TCC 1080 

341 Phe Leu Cys Thr Pro Cly Gly He Ser Pro Ala Clu Asn Pro Cys Ser Asp Phe Cly Trp 360 

lORl CAC CTC TAT CCT CAA CCA CTC TAC CTA CTT CTA AAA GAA CTT TAC AAC CCA TAC CCC CTA H40 

301 Clu Val Tyr Pro Clii Cly Leu Tyr Leu Leu Leu Ly.s Clu Ufu Tyr Asn Arg Tyr Cly Vai 380 

114 1 CAC TTC Arc CTX; AlV lUii; AAC GGT CTT TCA CAC ACC A*;*; lIAT CCC TTC AGA CCC CCA TAC 1 2UO 

3B1 Asp Leu Hir VaJ Thi Asn Cly Va J Ser Asp Set Arc Asp Ala Leu Arg Pro Ala Tyi 400 

1:^0) CTt; rnt- rn; cat (rrr tac aoc cta t<:(; aaa gcc ctrr aai* cac (w:c att ccc ctc aaa net i^t>» 

40) I..MI Val '..•t His V.»l Tyi Sct V#i i Tip Lys Ala Al** a-.h f:iij i;ly I U- Pro Val l.ys f ; I v 

i.itil TAt' trii- l Ac -nn; Ai;f rrt; aca t;At- aat tai* cai; t»u. tu t- i ai; iu;t* im- A<a; rrAi; aaa i » 

A'.n Tyi Hir: Ti|» StM l,i>ti Tlir A?;p Asrii Tyr Tr |» A!,. Clti i:iy I'h.- Arq t;it» l.y.*; PL* -^l-' 
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J *. : ..*.t i-iA i:*r. at*, (rrr ..a* aaa a. y aa.. 

-11' • »> '"w V.I I M..t V.^ I A-.|. II.. t.y rill I.-.-. 

I iMi i-r« .lu; »;a*; at*- (;i-a a« « . . aj a a. tu.A a»a 

14-11 1 A* : TA A 1440. 

4**1 i:itt Kiid 4H2 



AAA A- . TAT i , . . . a A. .. • .. • ' TA . : i . . i - »- 

l-V:- Alii Tv» Ai.| i-f- . , Al.i V.I i »• 

• *r f : I :at I : Ai : t ta • a. - - a-? r -rr a< a i -r* : at« - t -i ■: 

I'rn A»*|» Kill I..... -Li,. Mi- I..-IJ Till II*- -I'l' 



Figure 1 (Continued) 

OC1/4V GLYCOSIDASE - 33G/B 
COMPLETE GENE SEQUENCE - 9/95 

J AT<; ATA ACA ACC TCC GAT l-TT iCA AAA CAT TTT ATC TTC CCA A<;i: t:»n Ai i; lU A (U A TAC 



lie Arg Krg Ser asp IM^** I'i 



?. I 



i-AG ATT CXA OCT CCA CCA A^l* »:AA (iAT CCC ACA GGC CCA TCA ATT TCC KAT CTC ITT TCA 
r.ln He Clu Cly Ala Aia Asn <;lu Asp Cly Arg CJy Pro Ser lie Trp Asp I Phc Ser 

TAC ACC CCT GCC AAA ACC CTC AAC f/TT CAC ACA CCA CAC CTT CCC TtH* CAC CAT TAT CAC 
A\ Mis Thr Pro Cly Lys Thr Leu Asn C)y Asp Thr Cly Asp Val Ala Cys Asp Hxs Tyr His 

\H\ CCA TAC AAG CAA CAT ATC CAC CTC ATC AAA GAA ATA CCC TTA CAC CCT TAC ACC TTC TCT 
61 Arg Tyr Lys Clu Asp He Cln Leu Het Lys Clu lie Cly Leu Asp Ala Tyr Arg Phe Ser 

2M ATC TCC TCC CCC ACA ATT ATC CCA CAT CCC AAC AAC ATC AAC CAA AAC CCT CTC CAT TTC 
81 He Ser Trp Pro Arg lie Met Pro Asp Cly Lya Asn lie Asn Cln Lys Cly Val Asp Phe 

301 TAC AAC AGA CTC CTT CAT CAC CTT TTC AAC AAT GAT ATC ATA CCA TTC CTA ACA CTC TAT 
101 Tyr Asn Arg Leu Val Asp Clu Leu Leu Lys Asn Asp He He Pro Phe Val Thr Leu Tyr 

561 CAC TGG GAC TTA CCC TAC CCA CTT TAT CAA AAA CCT CCA TCC CTT AAC CCA GAT ATA CCC 
121 His Trp ASP Leu Pro Tyr Ala Leu Tyr Clu Lys Cly Cly Trp Leu Asn Pro Asp lie Ala 

421 CTC TAT TTC ACA CCA TAC CCA ACC TTT ATC TTC AAC CAA CTC CCT CAT CCT CTC AAA CAT 
141 Leu Tyr Phe Arg Ala Tyr Ala Thr Phe Met Phe Asn Clu Leu Cly Asp Arg Val Lys Mis 

461 TCC ATT ACA CTC AAC CAA CCA TCC TCT TCT TCT TTC TCC CCT TAT TAC ACC CCA CAC CAT 
161 Trp lie Thr Leu Asn Clu Pro Trp Cys Ser Ser Phe Ser Cly Tyr Tyr Thr Cly Clu His 

541 CCC CCC CCT CAT CAA AAT TTA CAA GAA CCC ATA ATC CCC CCC CAC AAC CTC TTC ACC CAA 
IBl Aia pro Gly His Cln Asn Leu Cln GXu Ala He He Ala Ala His Asn Leu Leu Arg Clu 

601 CAT CCA CAT CCC CTC CAC CCC TCC AGA GAA CAA CTA ^ CA'' ^ ^ ^ ™ J^^^ 



901 TTT GAT ATC AAC AAT CCT CTT CCA TTT TCC TAT CTT CAC CCA CAC CTT CCC AAA ACC CAC 

301 Phe Asp Met Asn Asn Pro Leu Cly Phe Ser Tyr Val Gin Gly Asp Leu Pro Lys Thr Clu 

961 ATC CGA TXX; GAA ATC TAC CCG CAG GGA TTA TTT CAT ATC CTC CTC TAT CTG AAC GAA ACA 

321 Met Cly Trp Clu He Tyr Pro Cln Cly Leu Phe Asp Met Leu Val Tyr Leu Lys Clu Arg 



1021 TAT AAA CTA CCA CTT TAT ATC ACA GAC AAC CCC ATC 



Tyr Lys Leu Pro Leu Tyr He Thr Giu Asn Cly Met Ala Cly Pro Asp Lys Leu Clu Asn 

CCA AGA CTT CAT GAT AAT TAC CCA ATT CAA TAT TTC CAA AAG CAC TTT GAA AAA CCA CTT 
Cly Arg Val His Asp Asn Tyr Arg He Clu Tyr Leu Clu Lys His Phe Clu Lys Ala Leu 

CAA CCA ATC AAT CCA CAT CTT CAT TTC AAA CCT TAC TTC ATT TCC TCT TTC ATC CAT AAC 
Clu Ala He Asn Ala Asp val Asp Leu Lys Cly Tyr Phe He Trp Ser Leu Met Asp Asn 

1201 TTC GAA TCC C«X TCC CCA TAC TCC AAA CCT TTC CCT ATA ATC TAC CTA CAT TAC AAT ACC 
4U1 



341 



1081 
361 



1141 
561 



Phe Clu Trp AlA Cys iily Tyr Ser Lys Arg Phe Cly He He Tyr Val Asp Tyr Asn Thr 

CCA AAA ACC ATA TTi: AAA 
421 Pro Lys Arg lit- i^'u Lys 



i.V?i Asp l»he He Phe Cly Thr Ai.i TUt Airt AM Tyr 20 



120 
40 

180 
60 

240 
80 

300 
100 

360 
120 

420 
140 

480 

160 

540 

180 

600 
200 

660 



201 His Cly His Ala Val Cln Ala Ser Arg Clu Clu Val Lys Asp Gly Clu Val Cly Leu Thr 220 

661 AAC CTT CTC ATG AAA ATA CAA CCC GCC GAT CCA AAA CCC GAA ACT TTC TTC CTC CCA ACT 720 

221 Asn Val Val Met Lys He Clu Pro Gly Asp Ala Lya Pro Glu Ser Phe Leu Val Ala Ser 240 

721 CTT err GAT AAC TTC CTT AAT CCA TGG TCC CAT CAC CCT CTT CTT TTC GGA AAA TAT CCC 

241 Leu Val Asp Lys Phe Val Asn Ala Trp S«r His Asp Pro Val Val Phe Gly Lys Tyr Pro 

781 CAA CAA CCA CTT CCA CTT TAT ACC GAA AAA CCC TTC CAA CTT CTC CAT ACC CAT ATG AAT 

261 Clu Clu Ala Val Ala Leu Tyr Thr Glu Lys Cly Leu Gin Val Leu Asp Ser Asp Met Asn 

641 ATT ATT TCC ACT CCT ATA GAC TTC TTT OCT CTC AAT TAT TAC ACA ACA ACA CTT CTT CTT 900 

281 He He Ser Thr Pro He Asp Phe Phe Cly Val Asn Tyr Tyr Thr Arg Thr Leu Val Val 300 



780 
260 



840 

280 



960 
320 

1020 
340 



CCT CCA CCT GAT AAA TTC GAA AAC lOBO 



360 

1140 
3B0 

1200 
400 

I26U 
<20 



1261 CCA AAA ACC ATA TTt; AAA GAT TCA CCC ATC TCC TTC AAG CAA TTT CTA AAA TCT TAA HI7 

Asp Ser Ala Met Trp Leu Lys c;lu Pht- Leu Lys Ser Rnd 4»» 



Figure 2 

STAPHYLOTHERMUS MARINUS GLYCOSIDASE - 12G 
COMPLETE GENE SEQUENCE - 9/95 

1 TTC ATA ACX7 TTT CCT CAT TAT TTf Tr.: TTT 0<:a A. A .;c.-r A< A TCA TC-C: < A.: At.- .:a.. .... 

1 Met Aro Phe Pro Asp Tyr Phe Ueu Ph,. f:lv Tt.r Ala Thr Set Scr ii.s c;i„ ii,- 

61 CCT AAT AAC ATA TTT AAT CAT TCC TCC CAC TCC CAC ACT AAA CCC ARC ATT AA.: irr,: ac;A I/O 

21 Cly Asn Asn He Phe A5n Asp Trp Trp Clu Trp Clu Thr Lys CJy Ar, lie Lys v«| Arq AO 

r° CCT AAC CCA TCT AAT CAT TCC CAA CTC TAT AAA CAA CAC ATA CAC CTT ATC CCT CAG IBO 

41 ser Cly Lys Al* Cys Asn His Trp Clu Leu Tyr Ly, clu Asp lie clu Leu Met Al. clu II 

^ CCT TAT ACC TTC TCC ATA CAC TCC ACT AGA ATA TTT CCC ACA AAA CAT 7*0 

61 Leu Cly Tyr Asn Al. Tyr Ar, Ph. Ser 11. Clu Trp Ser Ar„ Ue Phe Pro A^ !sp lo 

'el H^I r° f" '^^ ^" ^ CTT AAT CTA err ACA AAA TAC 300 

ei His lie ASP Tyr Clu Ser Leu Asn Lys Tyr Lys Clu lie Val Asn Leu Leu Arg Lys Tyr 100 



301 CCC ATA CAA CCT CTA ATC ACT CTT CAC CAC TTC ACA AAC CCC CAA TCC TTT ATC AAA ATT ISO 

101 Cly «. Clu Pro V.1 11. Thr Leu Hi. Hi. Ph. Thr Asn Pro Cln ^ h« il^^ ^ 

ll\ CGA TCC ACT ACC CAA CAC AAC ATA AAA TAT TTT ATA AAA TAT CTA CAA CTT ATA CCT 420 

121 Cly Cly Trp Thr Arg Clu Clu Asn 11. Lys Tyr Phe He Ly, Tyr V.l clu Leu 11. a" "o 

421 TCC CAC ATA AAA CAC CTC AAA ATA TCC ATC ACT ATT AAT CAA CCA ATA ATA TAT CTT TTA 480 

141 ser Clu lie Lys Asp Val Lys 11. Trp Ii. Thr He Asn Clu Pro He He Tyr Val Leu "o 

^ ^ rT" TTA AAA ATA CCT CAT CAA S40 

161 Cln Cly Tyr 11. s.r Cly Glu Trp Pro Pro Cly He Ly. Asn Leu Lys lie Ala Asp Cln 180 

f»J **0 **T CTT "A AAA CCA CAT AAT CAA CCC TAT AAT ATA CTT CAT AAA CAC OCT 



«Di % ^ . — - WW* A/vi /uii- AAA tTT CAT AAA CAC CC5T 

181 val Thr Lys Asn L.u Lmxi Lys Ala Hi. Asn Clu Ala Tyr Asn 11. L#u Hi. Ly. His Cly 



600 
200 



721 ACC CCA GAA CTA CAA ACT CTC CCT CCA AAA TAC CCA CTT CAC CCC CCA AAT ATT CAT TTC 780 

241 Arg Gly Glu Leu Glu Thr Leu Arg Gly Lys Tyr Arg Val Glu Pro Gly Asn Ile Asp Phe 260 

781 ATA CCC ATA AAC TAT TAT TCA TCA TAT ATT CTA AAA TAT ACT TCC AAT CCT TTT AAA CTA 840 

261 Ile Gly Ile Asn Tyr Tyr Ser Ser Tyr Ile Val Lys Tyr Thr Trp Asn Pro Phe Lys Leu 280 

841 CAT ATT AAA CTC CAA CCA TTA CAT ACA CCT CTA TCC ACA ACT ATC CCT TAC TCC ATA TAT 900 

281 His Ile Lys Val Glu Pro Leu Asp Thr Gly Leu Trp Thr Thr Met Gly Tyr Cys Ile Tyr 300 

901 CCT AGA CCA ATA TAT GAA CTT CTA ATC AAA ACT CAT CAC AAA TAC CCC AAA CAA ATA ATC 960 

301 Pro Arg Gly Ile Tyr Glu Val Val Met Lys Thr His Glu Lys Tyr Gly Lys Glu Ile Ile 320 

961 ATT ACA CAC AAC CCT CTT CCA CTA CAA AAT GAT GAA TTA ACC ATT TTA TCC ATT ATC ACC 1020 

321 Ile Thr Glu Asn Gly Val Ala Val Glu Asn Asp Glu Leu Arg Ile Leu Ser Ile Ile Arg 340 

1021 CAC TTA CAA TAC TTA TAT AAA CCC ATC AAT CAA CCA CCA AAG CTC AAA CCA TAT TTC TAC 1080 

341 His Leu Gln Tyr Leu Tyr Lys Ala Met Asn Glu Gly Ala Lys Val Lys Gly Tyr Phe Tyr 360 

1081 TCC ACC TTC ATC CAT AAT TTT GAG TCC CAT AAA CCA TTT AAC CAA ACC TTC CCA CTA CTA 1140 

361 Trp Ser Phe Met Asp Asn Phe Glu Trp Asp Lys Gly Phe Asn Gln Arg Phe Gly Leu Val 380 

1141 CAA CTT CAT TAT AAC ACT TTT CAC ACA AAA CCT ACA AAA ACC CCA TAT CTA TAT ACT CAA 1200 

381 Glu Val Asp Tyr Lys Thr Phe Glu Arg Lys Pro Arg Lys Ser Ala Tyr Val Tyr Ser Gln 400 

1201 ATA CCA CCT ACC AAC ACT ATA ACT GAT CAA TAC CTA CAA AAA TAT CGA TTA AAG AAC CTC 1260 

401 Ile Ala Arg Thr Lys Thr Ile Ser Asp Glu Tyr Leu Glu Lys Tyr Gly Leu Lys Asn Leu 420 

1261 CAA TAA 1266 

421 Glu End 422 
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, r.TC TCvT rxa TCC COC TTT CXG TTC CAO ATC CCC 

; c:: in ^ c-s r. :% - ... ... -.c aw 



60 
30 

120 



1% t^: [To :rn n. x.p .v. ^n. v-. .r, ..p 

-.^r t— t: CCC CAC CAC CCC ATA AAC AAC TA- 1*0 

izti'T.^.T,^. ™ ;r. '^li - ... ^ ^ » 

^ f^t: CCC AUA CAC CTC CCrr etc AAC CTT TAC AOC XTT 3A0 

^n:^r.t^«~.s;i:ii"i:;«..^o.,. .. 

n-; t" IK ts T, tn in s :s s s ?n s ^ ^ 5S? 

/-w iTA CCC T»r TW: CGC CGC CTT XTA OAO CAC CTC ACC 420 

IV 5" ti; s ss: Its ^> ?s ^. « «" « »• 
!s s tii s s s tj: s s s ir. IT, tr. s s ir, 

^ f-Er w r-r AAO TAC ceo CCC TAC ATC C» AAC OCA CTC COS WC CTC 

Ml CAC ACC CTC CTC CAC TTC CCC AAO TAC ^cr> ^ ^ ^ ^ 

m Clu S«r Val Vftl Clu Pho Ala LyS Tyr At* aVa Tyr ii* axa « 

• .w^-«r-»«f-xrrTTCAACCAGCCCAWTOOrr<TOGACCTCC« 

,»«f- eec OAC CCC cca aac ctc atc ctc 
THERMOCOCCUS AEDII12RA GLYCOSIDASE (18B/G)
COMPLETE GENE SEQUENCE - 9/95

1 ATC ATC CAC TCC CCC CTT AAA CCC ATT ATA TCT CAC CCT CCC CCC ATA ALX ATC ACA ATA 60

1 Met Ile His Cys Pro Val Lys Gly Ile Ile Ser Glu Ala Arg Gly Ile Thr Ile Thr Ile 20

61 CAT TTA ACTT TTT CAA CCC CAA ATA AAT AAT TTC CTC AAT CCT ATC ATT CTC TTT CCC CAC 120

21 Asp Leu Ser Phe Gln Gly Gln Ile Asn Asn Leu Val Asn Ala Met Ile Val Phe Pro Glu 40

121 TTC TTC CTC TTT CCA ACC CCC ACA TCT TCT CAT CAC ATC CAC CCA CAT AAT AAA TGG AAC 180

41 Phe Phe Leu Phe Gly Thr Ala Thr Ser Ser His Gln Ile Glu Gly Asp Asn Lys Trp Asn 60

181 CAC TCC TCC TAT TAT CAC CAC ATA CCT AAC CTC CCC TAC AAA TCC CCT AAA CCC TCC AAT 240

61 Asp Trp Trp Tyr Tyr Glu Glu Ile Gly Lys Leu Pro Tyr Lys Ser Gly Lys Ala Cys Asn 80

241 CAC TCC CAC CTT TAC ACC CAA CAT ATA CAC CTA ATC CCA CAC CTC CCC TAC AAT CCC TAC 300

81 His Trp Glu Leu Tyr Arg Glu Asp Ile Glu Leu Met Ala Gln Leu Gly Tyr Asn Ala Tyr 100

301 CCC TTT TCC ATA GAG TCC ACC CCT CTC TTC CCC CAA CAC CCC AAA TTC AAT CAA CAA CCC 360

101 Arg Phe Ser Ile Glu Trp Ser Arg Leu Phe Pro Glu Glu Gly Lys Phe Asn Glu Glu Ala 120

361 TTC AAC CCC TAC CCT CAA ATA ATT GAA ATC CTC CTT CAC AAC CCC ATT ACT CCA AAC CTT 420

121 Phe Asn Arg Tyr Arg Glu Ile Ile Glu Ile Leu Leu Glu Lys Gly Ile Thr Pro Asn Val 140

421 ACA CTC CAC CAC TTC ACA TCA COG CTC TCC TTC ATC CCC AAC CCA CCC TTT TTC AAC CAA 480

141 Thr Leu His His Phe Thr Ser Pro Leu Trp Phe Met Arg Lys Gly Gly Phe Leu Lys Glu 160

481 CAA AAC CTC AAC TAC TCC CAC CAC TAC CTT CAT AAA CCC CCC CAC CTC CTC AAC CCA CTC 540

161 Glu Asn Leu Lys Tyr Trp Glu Gln Tyr Val Asp Lys Ala Ala Glu Leu Leu Lys Gly Val 180

541 AAC CTT CTA CCT ACA TTC AAC CAC CCC ATC CTC TAT CTT ATC ATC CCC TAC CTC ACA CCC 600

181 Lys Leu Val Ala Thr Phe Asn Glu Pro Met Val Tyr Val Met Met Gly Tyr Leu Thr Ala 200

601 TAC TOG CCC CCC TTC ATC AAC ACT CCC TTT AAA CCC TTT AAA CTT CCC CCA AAC CTC CTT 660

201 Tyr Trp Pro Pro Phe Ile Lys Ser Pro Phe Lys Ala Phe Lys Val Ala Ala Asn Leu Leu 220

661 AAC CCC CAT CCA ATC CCA TXT GAT ATC CTC CAT OCT AAC TTT CAT CTC COO ATA CTT AAA 720

221 Lys Ala His Ala Met Ala Tyr Asp Ile Leu His Gly Asn Phe Asp Val Gly Ile Val Lys 240

721 AAC ATC CCC ATA ATC CTC CCT CCA ACC AAC AGA CAC AAA CAC CTA GAA CCT CCC CAA AAC 780

241 Asn Ile Pro Ile Met Leu Pro Ala Ser Asn Arg Glu Lys Asp Val Glu Ala Ala Gln Lys 260

781 CCC GAT AAC CTC TTT AAC TCO AAC TTC CTT GAT CCA ATA TOO ACC CGA AAA TAT AAA CCA 840

261 Ala Asp Asn Leu Phe Asn Trp Asn Phe Leu Asp Ala Ile Trp Ser Gly Lys Tyr Lys Gly 280

841 CCT TTT CGA ACT TAC AAA ACT CCA CAA ACC CAT CCA CAC TTC ATA OCC ATA AAC TAC TAC 900

281 Ala Phe Gly Thr Tyr Lys Thr Pro Glu Ser Asp Ala Asp Phe Ile Gly Ile Asn Tyr Tyr 300

901 ACA CCC ACC CAC CTA ACC CAT ACC TOG AAT CCC CTA AAC TTT TTC TTC GAT CCC AAC CTT 960

301 Thr Ala Ser Glu Val Arg His Ser Trp Asn Pro Leu Lys Phe Phe Phe Asp Ala Lys Leu 320

961 CCA CAC TTA ACC CAC ACA AAA ACA CAT ATC CCT TCC ACT CTC TAT CCA AAC CCC ATA TAC 1020

321 Ala Asp Leu Ser Glu Arg Lys Thr Asp Met Gly Trp Ser Val Tyr Pro Lys Gly Ile Tyr 340

1021 CAA CCT ATA CCA AAC CTT TCA CAC TAC OCA AAC CCA ATC TAC ATC ACC CAA AAC CCC ATA 1080

341 Glu Ala Ile Ala Lys Val Ser His Tyr Gly Lys Pro Met Tyr Ile Thr Glu Asn Gly Ile 360

1081 CCT ACC TTA CAC GAT CAC TOO ACC ATA CAC TTT ATC ATC CAC CAC CTC CAC TAC CTT CAC 1140

361 Ala Thr Leu Asp Asp Glu Trp Arg Ile Glu Phe Ile Ile Gln His Leu Gln Tyr Val His 380

1141 AAA CCC TTA AAC CAT OCC TTT CAC TTC AGA CCC TAC TTC TAT TCC TCT TTT ATC CAT AAC 1200

381 Lys Ala Leu Asn Asp Gly Phe Asp Leu Arg Gly Tyr Phe Tyr Trp Ser Phe Met Asp Asn 400

1201 TTC CAC TOO CCT CAC CCT TTT AGA CCA CCC TTT CCC CTC CTC CAC CPO CAC TAC ACO ACC 1260

401 Phe Glu Trp Ala Glu Gly Phe Arg Pro Arg Phe Gly Leu Val Glu Val Asp Tyr Thr Thr 420

1261 TTC AAC ACC ACA CCC AGA AAC ACT CCT TAC ATA TAT CCA GAA ATT CCA ACC GAA AAO AAA 1320

421 Phe Lys Arg Arg Pro Arg Lys Ser Ala Tyr Ile Tyr Gly Glu Ile Ala Arg Glu Lys Lys 440

1321 ATA AAA CAC CAA CTC CTC CCA AAC TAT CCC CTT CCC CAC CTA TCA 1365

441 Ile Lys Asp Glu Leu Leu Ala Lys Tyr Gly Leu Pro Glu Leu End 455
THERMOCOCCUS CHITONOPHAGUS GLYCOSIDASE - 22G
COMPLETE SEQUENCE - 9/95

1 TTC CTT CCA GAG AAC TTT CTC TOO CCA CTT TCA CAC TCC CCA TTC CAC TTT CAA ATC CCC 60

1 Met Leu Pro Glu Asn Phe Leu Trp Gly Val Ser Gln Ser Gly Phe Gln Phe Glu Met Gly 20

61 CAC ACA CTG ACG ACC CAC ATT CAT CCA AAC ACA CAT TCC TCC TAC TCC CTA ACA CAT CAA 120

21 Asp Arg Leu Arg Arg His Ile Asp Pro Asn Thr Asp Trp Trp Tyr Trp Val Arg Asp Glu 40

121 TAT AAT ATC AAA AAA CCA CTA CTA ACT GCC CAT CTT CCC CAA CAC CCT ATA AAT TCA TAT 180

41 Tyr Asn Ile Lys Lys Gly Leu Val Ser Gly Asp Leu Pro Glu Asp Gly Ile Asn Ser Tyr 60

181 CAA TTA TAT CAC ACA GAG CAA CAA ATT CCA AAC GAT TTA GGC CTC AAC ACA TAT ACG ATC 240

61 Glu Leu Tyr Glu Arg Asp Gln Glu Ile Ala Lys Asp Leu Gly Leu Asn Thr Tyr Arg Ile 80

241 CGA ATT GAA TCC ACC ACA CTA TIT CCA TCC CCA ACG ACT TTT CTC GAC CTC CAC TAT CAA 300

81 Gly Ile Glu Trp Ser Arg Val Phe Pro Trp Pro Thr Thr Phe Val Asp Val Glu Tyr Glu 100

301 ATT CAT GAC TCT TAC CCC TTC CTA AAC CAT CTC AAC ATT TCT AAA CAC CCA TTA CAA AAA 360

101 Ile Asp Glu Ser Tyr Gly Leu Val Lys Asp Val Lys Ile Ser Lys Asp Ala Leu Glu Lys 120

361 CTT GAT GAA ATC OCT AAC CAA ACG CAA ATA ATA TAT TAT ACC AAC CTA ATA AAT TCC CTA 420

121 Leu Asp Glu Ile Ala Asn Gln Arg Glu Ile Ile Tyr Tyr Arg Asn Leu Ile Asn Ser Leu 140

421 AGA AAC ACC CCT TTT AAC CTA ATA CTA AAC CTA AAT CAT TTT ACC CTC CCA ATA TCC CTT 480

141 Arg Lys Arg Gly Phe Lys Val Ile Leu Asn Leu Asn His Phe Thr Leu Pro Ile Trp Leu 160

481 CAT CAT CCT ATC CAA TCT AGA CAA AAA CCC CTC ACC AAT AAC ACA AAC CGA TCC CTA ACC 540

161 His Asp Pro Ile Glu Ser Arg Glu Lys Ala Leu Thr Asn Lys Arg Asn Gly Trp Val Ser 180

541 GAA ACC ACT CTT ATA GAC TTT CCA AAA TIT CCC CCC TAT TTA CCA TAT AAA TTC CGA CAC 600

181 Glu Arg Ser Val Ile Glu Phe Ala Lys Phe Ala Ala Tyr Leu Ala Tyr Lys Phe Gly Asp 200

601 ATA CTA GAC ATO TCC ACC ACA TTT AAT CAA CCT ATC CTC CTC CCC CAC TTC CCC TAT TTA 660

201 Ile Val Asp Met Trp Ser Thr Phe Asn Glu Pro Met Val Val Ala Glu Leu Gly Tyr Leu 220

661 CCC CCA TAC TCA CCA TTC CCC CCC CCA CTC ATC AAT CCA GAA CCA CCA AAC TTA CTT ATC 720

221 Ala Pro Tyr Ser Gly Phe Pro Pro Gly Val Met Asn Pro Glu Ala Ala Lys Leu Val Met 240

721 CTA CAT ATC ATA AAC CCC CAT CCT TTA CCA TAT AGC ATO ATA AAC AAA TTT GAC ACA AAA 780

241 Leu His Met Ile Asn Ala His Ala Leu Ala Tyr Arg Met Ile Lys Lys Phe Asp Arg Lys 260

781 AAA CCT GAT CCA GAA TCA AAA GAA CCA CCT CAA ATA CGA ATT ATA TAC AAT AAC ATC CCC 840

261 Lys Ala Asp Pro Glu Ser Lys Glu Pro Ala Glu Ile Gly Ile Ile Tyr Asn Asn Ile Gly 280

841 CTC ACA TAT CCC TTT AAT CCC AAA GAC TCA AAC CAT CTA CAA CCA TCC GAT AAT CCC AAT 900

281 Val Thr Tyr Pro Phe Asn Pro Lys Asp Ser Lys Asp Leu Gln Ala Ser Asp Asn Ala Asn 300

901 TTC TTC CAC ACT CCC CTA TTC TTA ACG CCT ATC CAC ACG CGA AAA TTA AAT ATC GAA TTT 960

301 Phe Phe His Ser Gly Leu Phe Leu Thr Ala Ile His Arg Gly Lys Leu Asn Ile Glu Phe 320

961 CAC CGA GAG ACA TTT CTT TAC CTT CCA TAT TTA AAC CCC AAT CAT TCC CTC CGA CTC AAT 1020

321 Asp Gly Glu Thr Phe Val Tyr Leu Pro Tyr Leu Lys Gly Asn Asp Trp Leu Gly Val Asn 340

1021 TAT TAT ACA AGA CAA CTC CTT AAA TAC CAA CAT CCC ATC TTT CCA ACT ATC CCT CTC ATA 1080

341 Tyr Tyr Thr Arg Glu Val Val Lys Tyr Gln Asp Pro Met Phe Pro Ser Ile Pro Leu Ile 360

1081 ACC TTC AAC CCC CTT CCA CAT TAT CGA TAC CCA TCT AGA CCA CCA ACC ACG TCA AAC CAC 1140

361 Ser Phe Lys Gly Val Pro Asp Tyr Gly Tyr Gly Cys Arg Pro Gly Thr Thr Ser Lys Asp 380

1141 CCT AAT CCT CTT ACT CAC ATT CCA TCC GAG CTA TAT CCC AAA CCC ATC TAC CAC TCT ATX 1200

381 Gly Asn Pro Val Ser Asp Ile Gly Trp Glu Val Tyr Pro Lys Gly Met Tyr Asp Ser Ile 400

1201 CTA CCT CCC AAT CAA TAT CGA CTT CCT CTA TAC CTA ACA GAA AAC CCA ATA CCA GAT TCA 1260

401 Val Ala Ala Asn Glu Tyr Gly Val Pro Val Tyr Val Thr Glu Asn Gly Ile Ala Asp Ser 420

1261 AAA CAT CTA TTA ACC CCC TAT TAC ATC CCA TCT CAC ATT CAA CCC ATC GAA CAC CCT TAC 1320

421 Lys Asp Val Leu Arg Pro Tyr Tyr Ile Ala Ser His Ile Glu Ala Met Glu Glu Ala Tyr 440
1321 CAA AAT OCT TAT CAC CTC ACA CCA TAC TTA CAC TCC CCA TTA ACC OAT AAT TAC GAA TCC 1380

441 Glu Asn Gly Tyr Asp Val Arg Gly Tyr Leu His Trp Ala Leu Thr Asp Asn Tyr Glu Trp 460

1381 CCC TTA GCC TTC ACA ATC ACC TTT CCC TTC TAC GAA CTA AAC TTC ATA ACC AAA CAC ACA 1440

461 Ala Leu Gly Phe Arg Met Arg Phe Gly Leu Tyr Glu Val Asn Leu Ile Thr Lys Glu Arg 480

1441 AAA CCC AGC AAA AAC ACT CTA ACA GTA TTC ACA CAC ATA CTT ATT AAT AAT CCC CTA ACA 1500

481 Lys Pro Arg Lys Lys Ser Val Arg Val Phe Arg Glu Ile Val Ile Asn Asn Gly Leu Thr 500

1501 ACC AAC ATC ACG AAA CAG ATC TTA CAC CAC CCC TAC 1536

501 Ser Asn Ile Arg Lys Glu Ile Leu Glu Glu Gly End 512

PYROCOCCUS FURIOSUS GLYCOSIDASE - 7G1
COMPLETE GENE SEQUENCE - 10/95



A.TG TTC CCT GAA AAG TTC CTT TGG CGT CTG GCA CAA TCG GGT TTT CAC TTT GAA ATG CCG 
Met Phe Pro Giu Lys Phe Leu Txp Gly Val Ala Gin 5cr Cly Phe Gin Phe Glu Met Ciy 



CAT GAT CCC ATT CAX; GCT ACG C^LC AGG GCG TTA ACT AAT AAG ACC AAC GCC TCG CTT AAC 
K13 A9P Pro lie Glu Ala Arg GXu Arg Ala Lau Thr Aan Lys Arg Asn Cly Trp Val Asn 

CCA AGA ACA GTT ATA GAC TTT GCA AAC TAT CCC GOT TAC ATA CCC TAT AAG TTT GGA CAT 
Pro Arg Thr Val !!• Clu Phe Ala Lys Tyr Ala Ala Tyr Ila Ala Tyr Lys Phe Cly Asp 



11^ f" 5^ ^"^^ ^ ™ CCT TAX AGC GAG ATA AAC AAC TTT GAC ACT GAG 



" j-iM-vp wv^ Akj.^ ^WU*' TTT ACT GkAG 

Xla Asn AI. His Ala Lau Ala Tyc Arg Cln 11a Lys Lys Pha Asp Thx Glu 



to 

20 



GAT AAA CTC AGC ACG AAT ATT CAC ACT AAC ACT GAT TGG TCG CAC TCG CTA ACC GAT AAC 120 

2. Asp Lya Lau Arg Axg Asn lie Asp Thr Asn Thr Asp Trp Trp His Trp Val Arg Asp Lya 40 

ACA AAT ATA GAG AAA CCC CTC CTT ACT GGA GAT CTT CCC GAC GAC CCC ATT AAC AAT TAC !HO 

41 Thr Asn II* Glu Lys Gly Leu Val Ser Ciy Asp Leu Pro Glu Glu Cly lie A^^ lyz - 



60 

240 

80 



CAC CTT TAT GAC AAG GAC CAT GAC ATT CCA AGA AAG CTC GGT CTT AAT CCT TAC AGA ATA 
61 Glu Lau Tyr Clu Lys Asp His Glu He Ala Arg Lys Leu Cly Leu Asn Ala Tyr Arg lie 

41 CCC ATA GAG TCG ACC AGA ATA TTC CCA TCG CCA ACG ACA TTT ATT GAT CTT GAT TAT ACC 300 

Bl Giy He Clu Trp Ser Arg He Phe Pro Trp ^ro Thr Thr Phe He Asp Val A^p ly] S« iSo 

l^^ ^ '^^^ ^^"^ ^' ^" ^"^^ ^ ^TC ACC AAG GAC ACT TTC GAC GAC 360 

101 Tyr Asn Clu Ser Tyr Asn Leu He Glu Asp Val Lys He Thr Lys Asp Thr Leu Clu Ctu 120 

361 TTA GAT GAG ATC GCC AAC AAG AGG GAG CTC GCC TAC TAT ACC TCA GTC ATA AAC ACC C7G 420 

121 Leu Asp Glu He Ala Asn Lys Arg Glu Val Ala Tyr Tyr Arg Ser Val lie Aan Ser L^u ill 

421 AGG AGC AAG GGG TTT AAC CTT ATA CTT AAT CTA AAT CAC TTC ACC CTT CCA TAT TGG TTC 4fl0 

141 Arg Ser Lys Cly Phe Lys Val He Val Asn Leu Asn Kis Ph. Thr Ju pS Tyr 1% "u ill 



S40 
L80 



I'l ' — '^^^ A^-r «rr TAC ATA GCC TAT AAG TTT GGA GAT 600 

181 Pro Arg Thr Val He Clu Phe Ala Lys Tyr Ala Ala Tyr He Ala Tyr Lys Phe «y A^J 200 

€01 ATA CTG GAT ATG TGG AGC ACC TTT AAT GAC CCT ATG CTG CTT CTT GAC CTT GCC TAC CTA 660 

201 Ha val Asp Het Trp Ser Thr Phe Asn Clu Pro Mec Val Val Val Clu Leu Cl^ Hu 220 

tt^ l^^ ^ ^^'^ CCA GGG GTT CTA AAT CCA GAC GCC GCA AAC CTC CCG ATA 720 

221 Ala Pro Tyr Ser Ciy Phe Pro Pro Gly Val Leu Asn Pro Clu Ala Ala Lys Leu Ala He 240 



780 
260 

AAA CCT CAT AAG GAT TCT AAA GAC CCT CCA GAA GTT GGT ATA ATT TAC AAC AAC ATT GGA B40 
261 Lys Ala Asp Lys Asp Ser Lys Clu Pro Ala Glu Val Cly Ha He Tyr Asn A^ ^ 280 

300 



2fil' vl? V^'^ l^"" ^ GAr CCG AAC GAT TCC AAC GAT CTT AAC GCA GCA GAA AAC GAC AAC 900 

261 Val Ala Tyr Pro Lys Aso Pro Asn Asp Ser Lys Asp Val Lys Ala Ala Glu Asn A^ A^n 300 

^ TTC TTC GAC CCC ATA CAC AAA GGA AAA CTT AAT ATA GAC TTT 960 

301 Ph. Phe HI, 5er Cly Leu Phe Phe Clu Ala He His Lys Cly Lys Lau Asn He Glu Phe 320 



321 Asn cTv cVH 7i Z * 17^ TAT CTA AAC CGC AAT GAC TCG ATA GGG GTT AAT 1020 

321 Asp Gly Clu Thr Phe He Asp Ala Pro Tyr Leu Lys Gly Asn Asp Trp He Gly Val Asn 340 

^4] «A CTA CTT ACG TAT CAC CAA CCA ATC TTT CCT TCA ATC CCG CTC ATC lOBO 

341 Tyr Tyr Thr Arg Clu Val Val Thr Tyr Cln Giu Pro Met Phe Pro Ser He Pro Leu lie 360 

C TTT AAG GGA GTT CAA CGA TAT CCC TAT GCC TGC ACA CCT GGA ACT CTC TCA AAC GAT 1140 

361 Thr Ph. Ly. Cly Val Gin Cly Tyr Cly Tyr Ala Cys Arg Pro Cly Thr Leu Ser Ly^ aS 380 

^ ^ AGC GAC ATA GGA TGG GAA CTC TAT CCA GAC CCG ATC TAC GAT TCA ATA 1200 

381 Asp Arg Pro Val Ser Asp Ha Gly Trp Clu Leu Tyr Pro Clu Cly Met Tyr Asp Ser He 400 

ioi CAC AAG TAC CGC CTT CCA CTT TAC CTC ACG GAG AAC GGA ATA CCG GAT TCA 1260 

401 Val C.u Ala His Lys Tyr Ciy Val Pro Val Tyr V*l Thr Clu Asn Gly He Ala A,p Ser 420 
1261 WkH GAC ATC CTA XGA CCT TAC TAC ATA CCG XGC CAC ATK AAG ATG ATA GAG AAG GCC TTT 1320

421 Lys Asp Ile Leu Arg Pro Tyr Tyr Ile Ala Ser His Ile Lys Met Ile Glu Lys Ala Phe 440

1321 GAG CAT CCG TAT CAA CTT AAC GCC TAC TTC CAC TGC GCA TTA ACT GAC AAC TTC GAC TGG 1380

441 Glu Asp Gly Tyr Glu Val Lys Gly Tyr Phe His Trp Ala Leu Thr Asp Asn Phe Glu Trp 460

1381 CCT CTC CGC TTT AGA ATG CCC TTT GCC CTC TAC CAA GTC AAC CTA ATT ACA AAC GAG AGA 1440

461 Ala Leu Gly Phe Arg Met Arg Phe Gly Leu Tyr Glu Val Asn Leu Ile Thr Lys Glu Arg 480

1441 ATT CCC AGO GAG AAG AGC GTG TCG ATA TTC AGA GAG ATA GTA GCC AAT AAT GGT GTT ACG 1500

481 Ile Pro Arg Glu Lys Ser Val Ser Ile Phe Arg Glu Ile Val Ala Asn Asn Gly Val Thr 500

1501 AAA AAG ATT GAA GAG GAA TTC CTG AGG GGA TGA 1533 

SOI Lys Lys He Giu Clu Glu Ucu Leu Arg Gly End 511 
9 18 27 36 45 54

5' ATG ACA ATA CGT TTA GCG ACQ CTC GCG CTC TGC CCA GCC CTG AGC CCA CTC ACC
Met Arg Ile Arg Leu Ala Thr Leu Ala Leu Cys Ala Ala Leu Ser Pro Val Thr

63 72 81 90 99 108

TTT CCA CAT AAT GTA ACC CTA CAA ATC GAC GCC GAC GCC GGT AAA AAA CTC ATC
Phe Ala Asp Asn Val Thr Val Gln Ile Asp Ala Asp Gly Gly Lys Lys Ile

117 126 135 144 153 162

ACC CGA GCC CTT TAC COC ATC AAT AAC TCC AAC CCA CAA ACC CTT ACC GAT ACT 
Sar Arg Ala Tyr Gly Mac Asn Ara Ser Asn Ala Glu Ser Leu Thr Asp Thr 

171 180 189 198 207 216

GAC TOO CAQ CGT TTT CCC GAT CCA COT GTG CGC ATO CTQ CGG GAA AAT GCC GCC
Asp Trp Gln Arg Phe Arg Asp Ala Gly Val Arg Met Leu Arg Glu Asn Gly Gly

225 234 243 252 261 270

AAC AAC AGC ACC AAA TAT AAC TGG CAA CTG CAC CTG ACC ACT CAT CCG GAT TCG
Asn Asn Ser Thr Lys Tyr Asn Trp Gln Leu His Leu Ser Ser His Pro Asp Trp

279 288 297 306 315 324

TAC AAC AAT CTC TAC GCC GCC AAC AAC AAC TCG OAC AAC COG GTA GCC CTO ATT
Tyr Asn Asn Val Tyr Ala Gly Asn Asn Asn Trp Asp Asn Arg Val Ala Leu Ile

333 342 351 360 369 378

CAG GAA AAC CTC CCC CGC GCC GAC ACC ATC TCC CCA TTC CAC CTC ATC CGT AAC
Gln Glu Asn Leu Pro Gly Ala Asp Thr Met Trp Ala Phe Gln Leu Ile Gly Lys

387 396 405 414 423 432

GTC GCG GCG ACT TCT GCC TAC AAC TTT AAC GAT TGG GAA TTC AAC CAQ TCG CAA
Val Ala Ala Thr Ser Ala Tyr Asn Phe Asn Asp Trp Glu Phe Asn Gln Ser Gln

441 450 459 468 477 486

TCG TGG ACC CGC GTC OCT CAG AAT CTC OCT CGC GCC GOT GAA CCC AAT CTO CAC
Trp Trp Thr Gly Val Ala Gln Asn Leu Ala Gly Gly Gly Glu Pro Asn Leu Asp

495 504 513 522 531 540

GGC GGC GGC GAA GOG CTG GTT GAA GGA GAC CCC AAT CTC TAC CTC ATO GAT TGQ
Gly Gly Gly Glu Ala Leu Val Glu Gly Asp Pro Asn Leu Tyr Leu Met Asp Trp

549 558 567 576 585 594

TCG CCA GCC GAC ACT GTG GGT ATT CTC GAC CAC TGC TTT CGC GTA AAC CCC CTC
Ser Pro Ala Asp Thr Val Gly Ile Leu Asp His Trp Phe Gly Val Asn Gly Leu

603 612 621 630 639 648

CCC CTO CGG CGT OCC AAA OCC AAA TAC TGG AGT ATG GAT AAC GAG CCC GCC ATC
Gly Val Arg Arg Gly Lys Ala Lys Tyr Trp Ser Met Asp Asn Glu Pro Gly Ile

657 666 675 684 693 702

TGCGTTGCCACCCACGACGATaTAGTGAAAGAACAAACGCCGCrrAaAAGATTO:
Trp Val Gly Thr His Asp Asp Val Val Lys Glu Gln Thr Pro Val Glu Asp Phe

Figure 9 
Bankia gouldi endoglucanase (37GP1) (continued)

711 720 729 738 747 756

CTO CXC ACC TAT TTC (»A ACC GCC AAA AAA GCC CCK: GCC AAA TTT CCC GOT ATT 

Leu His Thr Tyr Bb« Glu Thr Ala Lvb Lya Ala Aro Ala l.y8 P»u» Pro Cly Il« 

765 774 783 792 801 810

AAAATCACXCGTKCCTCCCCCCTAATGAGTGOCACTGGTATGCCTGGGGCOGT 

Ly» Ila Thr Oly Pro Val Pro Ala Aan Glu Trp Gift Trp Tyr Ala Trp Gly Gly 

819 828 837 846 855 864

TTC TCC CTA CCC CAG GAA CAA GOO TTT ATO K3C TOG ATG GAG TAT TTC ATC AAC 
Ptie Ser Val Pro Gin Glu Gin Cly PIxa Mat S«r Trp Mat Glu Tyr Pho Ila Ly* 

873 882 891 900 909 918

GGOCTGTCTGAAGAOCAACGCCCAACTOCTCTTCGCCTCCTCaATGa-ACTCGAT 

Arg Val Ser Glu Glu Gin Arg Ala Sor Gly Val Arg Oau Lwa AiV Val Aap 

927 936 945 954 963 972 

f-m exff TAC TAG CCC GGC GCT TAC AAT GCG GAA GAT ATC GTG CAA TTA CAT CGC 
Su Hi. Tyr Tyr pro Gly Ala Tyr Aan Ala Clu Asp Ilo Val Gin i-« Hi. Arg 

981 990 999 1008 1017 1026 

ACGTTCTTCGACCGCGACWTGrrPTCAeiCGATGCCAACCGaGTGAAAATOGTA 

Tl>r Phe Pba A«p Arg Aap Pba Val Sar Lau Asp Ala Aan Gly Val Vya Het Val 

1035 1044 1053 1062 1071 1080

OAAOCTaGCTOOOATGACAOCATCAACAAQGAATATArrTTCGCGCGAGfTCAAC 

Olu Gly Oly Trp A«p Asp S«r Ila A«i Ly» Glu Tyr Ilo Ph« Gly Arg V«l A«» 

1089 1098 1107 1116 1125 1134 

GAT TGG CTC GAG CAA TAT ATO OOO CCA GAC CAT GOT 6TA ACC era GGC TTA ACC 

^p Trp Leu Glu Glu Tyr Met Gly Pro A«p Hie Cly Val Tto X*u Gly U.u Thr 

1143 1152 1161 1170 1179 1188

GAA ATG TCC OIO CGC AAT OTO AAT CCO ATG ACT ACC GCC ATC TGO TAT GCC TCC 

Glu Mot cy» val Arg A«n Val Aan Pro M-c Thr Wir Ala Ilo Trp Tyr Ala Sar 

1197 1206 1215 1224 1233 1242

ATC CTC GGC ACC TIC GCG »T AAC CCC CTC CAA ATA TTC ACC CCA TGO TCC T« 

net Gly Thr Pho Ala Aap Aan Gly Val Olu lie phe Thr Pro Trp Cya Trp 

1251 1260 1269 1278 1287 1296 

AACACCGGAATCTGOGAAACACTCCACCTCTTCAGCCCCTACAACAAACCTTAT 

Aan Thr Gly Mot Trp Clu TJir Uou His Lau Pho Sor Arg Tyr Aan Lys Pro Tyr 

1305 1314 1323 1332 1341 1350 

CCOCTCGCCTCCAGCTCCAOTCTTGAAGAGTrPCTCACCCCCTACAGCTCCAyr 

Arg Val Ala Sor Sar Sar Sar Umu Olu Glu Phe Val Ser Ala Tyr Ser Ser lie 

1359 1368 1377 1386 1395 1404

AAC CAA CCA CAA GAC OCC ATO ACG OTA CTT CTO GTO AAT CGT TCC ACT ACC GAG 
A« Glu Ala ClU AOP Ala Mat TJU- val L«. La« Val Aan Arg Sor Tta S-r Glu 

Figure 9 (Continued) 
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1521 
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1566 
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Asn Ala Ijmu 
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thr Val 
1575 1584 1593 1602 1611

TTO CCC OCT CTG TCC GTT ACT GCA ATA TTG CTC AAG GCC CGO CCC TAA 3'

Leu Pro Pro Leu Ser Val Thr Ala Ile Leu Leu Lys Ala Arg Pro ***



Figure 9 (Continued) 






WO 97/25417






PCT/US97/00092 



Thermotoga maritima Alpha-galactosidase
Complete Gene Sequence



9 18 27 36 45 54

CTC ATC TCT CTC OXA ATA TPC GGA AAC ACC TTC AHA GAG OCA AGA ITC GTT CTC

Val Ile Cys Val Glu Ile Phe Gly Lys Thr Phe Arg Glu Gly Arg Phe Val Leu

63 72 81 90 99 108

AAA GAG AAA AAC TTC ACA CTT CAC TTC CCG OTC GAG AAG ATA CAC CTT GGC TOG

Lys Glu Lys Asn Phe Thr Val Glu Phe Ala Val Glu Lys Ile His Leu Gly Trp

117 126 135 144 153 162

AAO ATC TCC GGC AGG CTC AAG OCA ACT CCG GGA AQG CTT GAG OTT CTT CGA ACC

Lys Ile Ser Gly Arg Val Lys Gly Ser Pro Gly Arg Leu Glu Val Leu Arg Thr

171 180 189 198 207 216

AAA GCA CCG GAA AAG GTA CTT GTG AAC AAC TGG CAG TCC TGG GGA CCG TGC AGG

Lys Ala Pro Glu Lys Val Leu Val Asn Asn Trp Gln Ser Trp Gly Pro Cys Arg

225 234 243 252 261 270

GTC GTC GAT OCC TTT TCT TTC AAA CCA OCT GAA ATA GAT COS AAC TOG AGA TAC

Val Val Asp Ala Phe Ser Phe Lys Pro Pro Glu Ile Asp Pro Asn Trp Arg Tyr

279 288 297 306 315 324

ACC GCT TCG GTG GTS CCC GAT GTA CTT GAA AGO AAC CTC CAG AGC GAC IXT TTC

Thr Ala Ser Val Val Pro Asp Val Leu Glu Arg Asn Leu Gln Ser Asp Tyr Phe

333 342 351 360 369 378

GTG GCT GAA GAA GGA AAA GTG TAG GGT TIT CTG ACT TCG AAA ATC OCA CAT CCT

Val Ala Glu Glu Gly Lys Val Tyr Gly Phe Leu Ser Ser Lys Ile Ala His Pro

387 396 405 414 423 432

TIC TTC OCT GTG GAA GAT OGG GAA CTT GTG GCA TAC CTC GAA TAT TIC GAT GTC

Phe Phe Ala Val Glu Asp Gly Glu Leu Val Ala Tyr Leu Glu Tyr Phe Asp Val

441 450 459 468 477 486

GAG TTC GAC GAC TTT GCT CCT CTT GAA CCT CTC GTT OTA CTC GAG GAT CCC AAC

Glu Phe Asp Asp Phe Val Pro Leu Glu Pro Leu Val Val Leu Glu Asp Pro Asn

495 504 513 522 531 540

ACA CCC CTT CTT CTG GAC AAA TAC GCC GAA CTC GTC GGA ATC GAA AAC AAC OCC

Thr Pro Leu Leu Leu Glu Lys Tyr Ala Glu Leu Val Gly Met Glu Asn Asn Ala

549 558 567 576 585 594

AGA GTT CCA AAA CAC ACA CCC ACT GCA TCC TCC AOC TOG TAC CAT TAC TTC CTT

Arg Val Pro Lys Glu Thr Pro Thr Gly Trp Cys Ser Trp Tyr His Tyr Phe Leu
Thermotoga maritima Alpha-galactosidase

603 612 621 630 639 648

CAT CTC XrC TC3C GAA. CAC ACV CTC XAG AAC CTC AAC CTC OCG AAC AAT TTC CXrc 

A£jp i« ttir TrT> Glu Ciu Thr L^u Lyn Asn Ueu Lya I>«u Ala Lys Aon Phe P« 

657 666 675 684 693 702 

TTC GAG GTC TIC CAG ATA GAC GAC GCC TAC GAA AAC CAC XXA GCT GAC TCC CTC 

PtM Glu Val Pbe Gin lie Asp Asp Ala Tyx Glu Lys Asp lie Gly Acp Trp 

711 720 729 738 747 756 

Oro ACA AGA GGA GAC TTT CCA TCG GTG GAA GAG ATC OCA AAA CTTT ATA OCG GAA 

Val TSir Ary Gly Asp Phe Pro Scr Val Glu Glu Met Xla Lys Val He Ala Glu 

765 774 783 792 801 810

AAC OCT TIC ATC CCO ©XT ATA TOG ACC GCC COG TIC ACT GTT TCT GAA AOC TCC 

Aflta Gly Phe He Pro Gly He Ttp Tbx Ala Pro rtae Scr Val Ser Glu Tbr Scr 

819 828 837 846 855 864 

GAT CIA TTC AAC GAA CAT CCQ GAC TOO GT3^ CTC AAG GAA AAC GGA CAC CCG AAC 

A*p Val nie Asn Glu His Pro Asp Trp Val Val Lys Glu Asn Gly Glu Pro Lyc 

873 882 89X 900 909 918 

ATC GCT raC AGA AAC TOG AAC AAA AAC KTK 1>C GCC CTC GAT CTT TOG AAA GAT 

Met Ala Tyr Arg Asn Trp Asn Lya liy« Zlo Tyr Ala I^eu Asp Leu Scr Ly» Aap 

927 936 945 954 963 972 

GAG GTT CTC AAC TGG CTT TIC CAT CTC TTC TCA TCT CTG AGA AAO ATC GCC rOlC 

Glu Val Leu Am Trp Leu Phe Asp L«u Phe Ser Scr Leu Ary Lys Met Gly Tyr 

981 990 999 1008 1017 1026 

AOG TAC TIC AAC ATC GAC TTT CTC TTC CCG CGT GCC CTT OCA GGA GAA AGA AAA 



Arg Tyr Phe Lys Il« Asp Phe Leu Phe Ala Gly Ala Val Pro Gly Glu Ar^ Lys 

1035 1044 1053 1062 1071 1080

AAC KTK ACA CCA ATT CAC CCG TIC AGA AAA OGC ATT GAG ACC ATC AGA AAA 

Lya Aan He Thr Pro He Gin Ala n>e Ar? Lys Gly He Glu Thr Ila Arg l-y» 

1089 1098 1107 1116 1125 1134

GCC GTG GGA GAA GAT TCT TIC ATC CTC CGA TX3C OOC TCT CCC CTT CTT CCC CCA 

Ala Tol Gly Glu Asp Gcr Plie He Leu Gly Cys Gly S*tt Pro Leu Leu Pro Ala 

1143 1152 1161 1170 1179 1188

CTC GGA TCC CTC GAC OOC ATC AGO ATA OQA CCT CAC ACT OOC CCC TIC TOG GGA 

Val Gly Cys Vaj Asp Cly Met Arg He Gly Pro Aap Tkxx Ala Pro Phe Trp Oly 

Figure 10 (Continued) 
1197 1206 1215 1224 1233 1242

GWl CAT ATA CyUi GAC AAC OTJV OCT CXC CCT CCA ACJi TOG OCG CTC AGA AAC GCC 

GXu His lift Giu Asp Asn Cly AIa !>ro Ala Ala Ary Trp Ala 1/oa Arg Asn Ala 

1251 1260 1269 1278 1287 1296

ATA ACG AOO TAG TTC ATC CAC GAC AOG TTC TGG CTO AAC GAC OOC mc TOT CTG 

lie Ar^ Tyr Pho Mac Hic Asp Arg Phe Trp l»eu Aon Asp Prx> Acp Cys Leu 

1305 1314 1323 1332 1341 1350 

ATA CTG AGA GAG GAC AAA ACC CAT CTC ACA CAG AAG GAA AAG GAG CTC TSIC TCG 

lie Leu Arg Clu Glu Lys Thr Asp Leu "nir Gin Lys Glu Lya Glu Leu Oyr Ser 

1359 1368 1377 1386 1395 1404 

TAG ACC TGT OGA GTG CTC GAC AAC ATC ATC AIA GAA ACC GAT GAT CTC TCG CTC 

ryv TSir Cya Cly Val Leu Asp Asn Met lie lie Clu Ser Acp Asp Leu Ser Leu 

1413 1422 1431 1440 1449 1458 

GTC AGA GAT CAT OGA AAA AAC CTT CTC AAA GAA ACG CTC «A CTC CTC OCT OGA 

Val Arg Asp His Gly Ly» Lys Val Leu Lys Glu Thr Leu Glu Leu Leu Gly Gly 

1467 1476 1485 1494 1503 1512

AGA OCA COG GIT CAA AAC ATC ATC TOG GAG GAT CTG AGA TRC GM3 ATC GTC TOG 

Arg Pro Arg Val Gin Asn lie Wet Ser Clu Asp leu Arg Tyr Glu He Val Ser 

1521 1530 1539 1548 1557 1566 

TCT GGC ACT CTC TCA OCA AAC GTC AAG ATC GTG CTC GAT CTO AAC AGC AGA GAG 

Ser Cly Tte Leu Ser Gly Asn Val Lys He Val val A»p l«u Asn Sex Arg Glu 

1575 1584 1593 1602 1611 1620 

TTJZ CAC CTG GAA AAA GAA GGA AAC TCC TCC CTG AAA AAA AGA OTC GTC AAA AGA 

Tyr Hxs Leu Glu Lys Clu Gly Lys Ser Ser Leu Lys Lys Arg Val Val Lys Arg 

1629 1638 1647 1656 1665

GAA GAC GGA AGA AAC TTC TAC TTC TAC GAA CAC OCT CAG ACA GAA TGA 3'

Glu Asp Gly Arg Asn Phe Tyr Phe Tyr Giu Giu Cly Glu Arg Glu 
549 558 567 576 585 594

«C TCC ^ CTC CTTA AXC CAT GTC AAT XCC TAG ACG ^ ^ ^ 

Val Ser Phe ilu vll Zn 'hI'b vII J^n rii^ Tyr Thr Gly V»l Pro Tyr Arg Glu 

603 612 621 630 639 648

GAG CCC III ATC AtG GCC TGG GAG err GCA AAC GAA CCG CGC TCT GAG ACG GAC 

gIu pro Tta III Met Zl ^ gIu Leu Al« A«n Glu Pro Arg Cys Glu Thr Asp 

657 666 675 684 693 702

,UJ.TCGSAACACGCTCGTTGAGTGCCtGAAOGMATCAGCT^ 

lyl ser oly Tte llu Vel gIu Trp Val Lys Glu Met Ser Ser Tyr He Ly. 



ACTCTGSicCCAAC^CTCCTGGCTGTGGGGGACGAAGGAW 
ser Uu Zp Zl Hi, Zl vll ^« val Gly Asp Glu Gly Phe Phe Ser A«. 

774 783 792 801 810 

TAC GAA gS TIC AAA CCT TAC GGT GGA 6AA GW TM GW T^ 

gIu «y IZ Zl ^ gI; hi hi AI^ du Ttp AI* Tyr A«» Cly Ttp 

B28 837 8*6 855 864 

TCCGGTGSGACTGOAAOAAGCTCCTTTCCATAGAOAreGTC 

Cly ill l^ ^ '^^ Zl Zl ill om Val a.p Ph* Cly Thr 
873 882 891 900 909 918 

rrc CAC CTC tat cco tcc cac tog ggt gtc act cca aac tat g^ 

^be III L2u ^ ^r HI ^ cly val Ser Pro Glu Asn Tyr Ala Gin Trp 

927 936 945 954 963 972 

GGA GCG AAG TOG ATA GAA GAC CAC ATA AAG ATC GCA AAA GAG ATC 

lly 111 11^ 'r^ III HI H^ HI III ly» !!• Ala Lys Glu He Cly Ly« Pro 

990 999 1008 1017 1026 

i GAA TAT GCA ATT CCA AAG AGT GCG CCA GTT AAC AGA ACG CCC 

wll wll Zl HI HI ^ Hy III pro lli, sir Ala Pro Val A.n Arg Thr Ala 

1044 1053 10" 1071 1080 

XTC TAC^^ CTC T«3 AAC GAT CTG GTC TAC OAT CTC GOT GGA GAT GGA 

lie Tyr Arg 



720 729 738 747 756 



981 

GTT GTT CTG 



1035 10*4 1053 1062 1071 1080 

CTC toa AAC GAT CTG GTC TAC OAT i 

Zl Hi HI HI HI HI ^ A.P Leu Gly Gly Asp Gly Ala Met 
1089 1098 1107 1116 1125 1134

TTC TGG ATG CTC CCG GGA ATC GGQ GAA GOT TCG CAC AGA GAC GAG AGA CXX; TAC 
Phe Txp Met Leu Ala Gly lie Gly Glu Gly Ser Asp Arg Asp Clu A^g Cly ^ 

"52 1161 .1170 1179 iiaa 

™ Tf. !-! ^ AAC GAC GAC ACT CCA GAA GCC c!I 

Tyr Pro A.p Tyr Asp Gly Phe Arg He v«l Asn i[Ip i[Ip ser Pro gIu ill iiu 
1197 1206 1215 1224 1233 1242

!!!-!- ™ ™ !!!!!! '^'^ «^ <^ ATA ACA caa JI? 

Leu He Arg Olu Tyr Al« Ly. I^u Phe A.n Thx Gly Glu ^ Vll ciu ilp 

1278 1287 laog 

™ !!!!!!!!! ™ !!! ™ ^ ^""^ ^ Acc 

Thr Cya Ser Phe He Leu Pro Lys Asp Gly MeC Glu He LyI Lys ^ CIl Clu 
1305 1314 1323 1332 1341 1350 

!!?^???!!!?!?!!!r*^'^'^^^^*^'^'^*^'^'i"cncAAA 

Val Arg Al. Gly Val Phe Aap Tyr Ser Asa ^e olu Z.l ier vll L^ 

1359 1368 1377 1386 1395 1404 

™ ™ !!! ^ ^ °« A™ CAT CTC GQA TAC GGA ATT TAC 

Val Clu Asp Leu Val Phe Glu Asn Olu He Glu His L^ Gly cly He 

1413 1422 1431 1440 1449 iash 

GGC TTT CAT CTC GAC ACA ACC CGG ATC CCG GAT CGA GAA CAT GAA ATG TTC CTT 
Cly Phe Asp Leu Asp Thr Thr Arg He Pro Asp Cly Glu His Glu Met Phe Z^u 

1467 1476 1485 1494 1503 

™!!f™!!!!!!!?^^^!!!!!!:^"^^'^'«:^««AAA gtc^ 

Clu Gly His Phe Cln Gly Lys Thr Val Lys ^p s« He Zys vli vll 

1S21 1530 1539 1548 1557 , 

^!!!!^!!^!!!!!!!!!!!!!!?l^°*°*^"''^''»"^'CTccA gaa^ 

Asn Glu Ala Arg lyr Val Leu Ala Olu ZIu vll 1^ 

^V.l ^ ^^Vti «?i 

GAC 



Tyr 



GTC AAA AAC TCX; TGG AAC AGC CGA ACC TOG CAG CCA C3AC TTC GGC TCA OCT 



V*l Lys Aan Trp Trp Aan Ser Oly Thr Trp Gin AX. CXu Pha cly Ser i^p 
Thermotoga maritima β-mannanase (6GP2) (continued)

1629 1638 1647 1656 1665 1674

ATT GAA TGG AAC GGT GAG GTG GGA AAT GGA GCA CTG CAG CTG AAC GTG AAA CTC 

Ile Glu Trp Asn Gly Glu Val Gly Asn Gly Ala Leu Gln Leu Asn Val Lys Leu

1683 1692 1701 1710 1719 1728 

CCC GGA AAG AGC GAC TGG GAA GAA GTO AGA GTA GCA AGG AAG TTC GAA AGA CTC 

Pro Gly Lys Ser Asp Trp Glu Glu Val Arg Val Ala Arg Lys Phe Glu Arg Leu

1737 1746 1755 1764 1773 1782 

TCA GAA TGT GAG ATC CTC GAG TAC GAC ATC TAC ATT CCA AAC GTC GAG GGA CTC 

Ser Glu Cys Glu Ile Leu Glu Tyr Asp Ile Tyr Ile Pro Asn Val Glu Gly Leu

1791 1800 1809 1818 1827 1836 

AAG GGA AGG TTO AGG CCQ TAC GCG GTT CTO AAC CCC GGC TGG GTG AAG ATA GGC 

Lys Gly Arg Leu Arg Pro Tyr Ala Val Leu Asn Pro Gly Trp Val Lys Ile Gly

1845 1854 1863 1872 1881 1890 

CTC GAC ATO AAC AAC GCG AAC CTC GAA ACT GCG GAO ATC ATC ACT TTC GGC GGA 

Leu Asp Met Asn Asn Ala Asn Val Glu Ser Ala Glu Ile Ile Thr Phe Gly Gly

1899 1908 1917 1926 1935 1944 

AAA GAG TAC AGA AGA TTC CAT GTA AGA ATT GAG TTC GAC AGA ACA GCG GGG GTG 

Lys Glu Tyr Arg Arg Phe His Val Arg Ile Glu Phe Asp Arg Thr Ala Gly Val

1953 1962 1971 1980 1989 1998 

AAA GAA CTT CAC ATA GGA GTT GTC GGT GAT CAT CTG AGO TAC GAT GGA CCG ATT 

Lys Glu Leu His Ile Gly Val Val Gly Asp His Leu Arg Tyr Asp Gly Pro Ile

2007 2016 2025 2034 2043 

TTC ATC GAT AAT GTG AGA CTT TAT AAA AGA ACA GGA GGT ATG TGA 3'

Phe Ile Asp Asn Val Arg Leu Tyr Lys Arg Thr Gly Gly Met
AEPII 1a β-mannosidase (63GB1)

AEPII 1a β-mannosidase (63GB1) (continued)



549 558 567 576 585 594

AGO ACA GTT CTT GAG TTT CCC AAQ TAT OCT OCT TAC ATC GCC CAT GCG CTC CCSA 



Arg Thr Val Val 



Glu Phe Ala Tyr Aia Ala Tyr lie Ala His Ala Leu Gly 



603 612 621 630 639 648

GAC CTC CTG GAC ACA TCG AGC ACC TTC AAC GAA CCT ATG GTA OTT GTO GAG CTC 

rip Leu Val Asp Thr Trp Sar Thr Pbo Asn Glu Pro Met Val Val Val Glu Leu 

657 666 675 684 693 702

GGC TAC CTC GCC CCC TAC TCA GGA TTT CCC CCO GGA GTC ATG AAC CCC GAG GCC 

Gly TVr l*u Ala Pro Tyr Ser Gly Phe Pro Pro Gly Val Met Asn Pro Glu Ala 

711 720 729 738 747 756 

CCG AAG CTG GCO ATC CTC AAC ATC ATA AAC GCC CAC GCC TTG GCA TAT AAG ATG 

All L^ Leu Ala He Leu A«n Met He Aim Ala His Ala Leu Ala Tyr Lys Met 

765 774 783 792 801 810

ATA AAG AGG TTC GAC ACC AAG AAG GCC GAT GAG GAT AGC AAG TCC CCT GCG GAC 

111 tog Pb« Asp Thr Lys Lys Ala Asp Glu Asp Ser Lys Ser Pro Ala Asp 

819 828 837 846 855 864 

GTT GGC ATA ATT TAC AAC AAC ATC OCT GTT GCC TAC CCT AAA GAC CCT AAC GAT 

vll Gly Tie lie Tyr Asn Asn He Gly Val Ala Tyr Pro Lya Asp Pro Asn Asp 

873 882 891 900 909 918

CCC AAG GAC GTT AAAGCAGCCGAAAACGACAACTACTTCCACAGCGGACTGTTC 



Pro Lys Asp Val Lys 



Ala Ala Glu Asn Asp Asn Tyr Pbo His Ser Gly Leu Phe 



927 936 945 954 963 972 

TTTGATGCCATCCACAAGGGTAAOCrCAACATAGAaTTCCACGCCGAAAACTrT 

Phe A^i All lie His Lys Gly Lys Leu Asn Ha Glu Phe Asp Gly Glu Asn Phe 

981 990 999 1008 1017 1026 

GTA AAA GTT AGA CAC CTA AAA GGC AAT GAC TGG ATA GGC CTC AAC TAC TAC ACC

Val Lys Val Arg His Leu Lys Gly Asn Asp Trp Ile Gly Leu Asn Tyr Tyr Thr

1035 1044 1053 1062 1071 1080 

CGC GAC GTT CTT ACA TAT TCG GAG CCC AAG TTC CCA ACT ATA CCC CTC ATA TCC

Arg Glu Val Val Arg Tyr Ser Glu Pro Lys Phe Pro Ser Ile Pro Leu Ile Ser

1089 1098 1107 1116 1125 1134

TTC AAG GGC CTT CCC AAC TAC GGC TAC TCC TGC AGG CCC GGC ACG ACC TCC CCC

Phe Lys Gly Val Pro Asn Tyr Gly Tyr Ser Cys Arg Pro Gly Thr Thr Ser Ala

1143 1152 1161 1170 1179 1188 

GAT GGC ATG CCC GTC AGC GAT ATC GGC TGG CAA GTC TAT CCC CAG GGA ATC TAC 



Asp Gly Met Pro Val Ser Asp Ile Gly Trp Glu Val Tyr Pro Gln Gly Ile Tyr



1197 1206 1215 1224 1233 1242 

GAC TCG ATA GTC GAG GCC ACC AAG TAC ACT GTT CCT GTT TAC GTC ACC GAG AAC

Asp Ser Ile Val Glu Ala Thr Lys Tyr Ser Val Pro Val Tyr Val Thr Glu Asn

1251 1260 1269 1278 1287 1296 

GCT GTT GCG GAT TCC GCG GAC ACC CTG AGG CCA TAC TAC ATA GTC AGC CAC GTC 

Gly Val Ala Asp Ser Ala Asp Thr Leu Arg Pro Tyr Tyr Ile Val Ser His Val

1305 1314 1323 1332 1341 1350 

TCA AAC ATA GAG GAA GCC ATT GAG AAT GGA TAC CCC GTA AAA GGC TAC ATG TAC 

Ser Lys Ile Glu Glu Ala Ile Glu Asn Gly Tyr Pro Val Lys Gly Tyr Met Tyr

1359 1368 1377 1386 1395 1404 

TGG CCC CTT ACC CAT AAC TAC GAG TCG GCC CTC GGC TTC AGC ATG AGG TTT GGT

Trp Ala Leu Thr Asp Asn Tyr Glu Trp Ala Leu Gly Phe Ser Met Arg Phe Gly

1413 1422 1431 1440 1449 1458 

CTC TAC AAG GTC GAC CTC ATC TCC AAC GAC AGG ATC CCC AGG GAG AGA AGC GTT 



Leu Tyr Lys Val Asp Leu Ile Ser Lys Glu Arg Ile Pro Arg Glu Arg Ser Val



1467 1476 1485 1494 1503 1512 

GAG ATA TAT CCC AGG ATA CTG CAG TCC AAC GCT GTT CCT AAG GAT ATC AAA GAG 



Glu Ile Tyr Arg Arg Ile Val Gln Ser Asn Gly Val Pro Lys Asp Ile Lys Glu

1521 1530 1539

GAG TTC CTG AAG GGT GAG GAG AAA TGA 3'

Glu Phe Leu Lys Gly Glu Glu Lys ***

9 18 27 36 45 54

ATG GTA GAA AGA CAC TTC AGA TAT GTT CTT ATT TGC ACC CTG TTT CTT GTT ATG

Met Val Glu Arg His Phe Arg Tyr Val Leu Ile Cys Thr Leu Phe Leu Val Met

63 72 81 90 99 108

CTC CTA ATC TCA TCC ACT CAG TGT GGA AAA AAT GAA CCA AAC AAA AGA GTG AAT

Leu Leu Ile Ser Ser Thr Gln Cys Gly Lys Asn Glu Pro Asn Lys Arg Val Asn

117 126 135 144 153 162 

AGC ATG GAA CAG TCA GTT GCT GAA ACT CAT AGC AAC TCA GCA TTT GAA TAC AAC 

Ser Met Glu Gln Ser Val Ala Glu Ser Asp Ser Asn Ser Ala Phe Glu Tyr Asn

171 180 189 198 207 216 

AAA ATG GTA GGT AAA GGA GTA AAT ATT GGA AAT GCT TTA GAA GCT CCT TTC GAA 

Lys Met Val Gly Lys Gly Val Asn Ile Gly Asn Ala Leu Glu Ala Pro Phe Glu

225 234 243 252 261 270 

GGA GCT TGG GGA GTA AGA ATT GAG GAT GAA TAT TTT GAG ATA ATA AAG AAA AGG 

Gly Ala Trp Gly Val Arg Ile Glu Asp Glu Tyr Phe Glu Ile Ile Lys Lys Arg

279 288 297 306 315 324 

GGA TTT GAT TCT GTT AGG ATT CCC ATA AGA TGG TCA GCA CAT ATA TCC GAA AAG 

Gly Phe Asp Ser Val Arg Ile Pro Ile Arg Trp Ser Ala His Ile Ser Glu Lys

333 342 351 360 369 378 

CCA CCA TAT GAT ATT GAC AGG AAT TTC CTC GAA AGA GTT AAC CAT GTT GTC GAT

Pro Pro Tyr Asp Ile Asp Arg Asn Phe Leu Glu Arg Val Asn His Val Val Asp

387 396 405 414 423 432 

AGG GCT CTT GAG AAT AAT TTA ACA GTA ATC ATC AAT ACG CAC CAT TTT GAA GAA 

Arg Ala Leu Glu Asn Asn Leu Thr Val Ile Ile Asn Thr His His Phe Glu Glu

441 450 459 468 477 486 

CTC TAT CAA GAA CCC GAT AAA TAC GGC GAT GTT TTG GTG GAA ATT TGG AGA CAG 

Leu Tyr Gln Glu Pro Asp Lys Tyr Gly Asp Val Leu Val Glu Ile Trp Arg Gln

495 504 513 522 531 540 

ATT GCA AAA TTC TTT AAA GAT TAC CCG GAA AAT CTG TTC TTT GAA ATC TAC AAC 

Ile Ala Lys Phe Phe Lys Asp Tyr Pro Glu Asn Leu Phe Phe Glu Ile Tyr Asn

OC1/4V Endoglucanase (33GP1) (continued)

549 558 567 576 585 594

GAG CCT GCT CAG AAC TTG ACA GCT GAA AAA TGG AAC GCA CTT TAT CCA AAA GTC

Glu Pro Ala Gln Asn Leu Thr Ala Glu Lys Trp Asn Ala Leu Tyr Pro Lys Val

603 612 621 630 639 648 

CTC AAA GTT ATC AGG GAG AGC AAT CCA ACC CGG ATT GTC ATT ATC GAT GCT CCA 

Leu Lys Val Ile Arg Glu Ser Asn Pro Thr Arg Ile Val Ile Ile Asp Ala Pro

657 666 675 684 693 702 

AAC TGG GCA CAC TAT AGC GCA GTG AGA ACT CTA AAA TTA GTC AAC GAC AAA CGC

Asn Trp Ala His Tyr Ser Ala Val Arg Ser Leu Lys Leu Val Asn Asp Lys Arg

711 720 729 738 747 756 

ATC ATT GTT TCC TTC CAT TAC TAC GAA CCT TTC AAA TTC ACA CAT CAG GGT CCC 

Ile Ile Val Ser Phe His Tyr Tyr Glu Pro Phe Lys Phe Thr His Gln Gly Ala

765 774 783 792 801 810 

GAA TGG GTT AAT CCC ATC CCX CCT GTT AGG GTT AAG TGG AAT GGC GAG GAA TCC

Glu Trp Val Asn Pro Ile Pro Pro Val Arg Val Lys Trp Asn Gly Glu Glu Trp

819 828 837 846 855 864 

GAA ATT AAC CAA ATC AGA AGT CAT TTC AAA TAC GTG ACT GAC TGG GCA AAG CAA 

Glu Ile Asn Gln Ile Arg Ser His Phe Lys Tyr Val Ser Asp Trp Ala Lys Gln

873 882 891 900 909 918 

AAT AAC CTA CCA ATC TTT CTT GGT GAA TTC GGT GCT TAT TCA AAA GCA GAC ATG 

Asn Asn Val Pro Ile Phe Leu Gly Glu Phe Gly Ala Tyr Ser Lys Ala Asp Met

927 936 945 954 963 972 

GAC TCA AGG GTT AAG TGG ACC GAA AGT GTG AGA AAA ATG GCG GAA GAA TTT GGA 

Asp Ser Arg Val Lys Trp Thr Glu Ser Val Arg Lys Met Ala Glu Glu Phe Gly

981 990 999 1008 1017 1026 

TTT TCA TAC GCG TAT TGG GAA TTT TGT GCA GGA TTT GGC ATA TAC GAT AGA TGG 

Phe Ser Tyr Ala Tyr Trp Glu Phe Cys Ala Gly Phe Gly Ile Tyr Asp Arg Trp

1035 1044 1053 1062 1071 1080 

TCT CAA AAC TGG ATC GAA CCA TTG CCA ACA GCT GTG GTT GGC ACA CGC AAA GAG

Ser Gln Asn Trp Ile Glu Pro Leu Ala Thr Ala Val Val Gly Thr Gly Lys Glu

TAA 3'

***

Figure 13 (Continued)

Thermotoga maritima Pullulanase (6GP3) (continued)

549 558 567 576 585 594

AAA AAC GGA GAA GAC ACA GAA CCG TAC CAG GTT GTG AAC ATG GAA TAC AAG GGA

Lys Asn Gly Glu Asp Thr Glu Pro Tyr Gln Val Val Asn Met Glu Tyr Lys Gly

603 612 621 630 639 648 

AAC GGG GTC TGC GAA GCG GTT GTT GAA GGC CAT CTC GAC GGA GTG TTC TAC CTC 

Asn Gly Val Trp Glu Ala Val Val Glu Gly Asp Leu Asp Gly Val Phe Tyr Leu

657 666 675 684 693 702 

TAT CAG CTG GAA AAC TAC GGA AAG ATC AGA ACA ACC GTC GAT CCT TAT TCG AAA 

Tyr Gln Leu Glu Asn Tyr Gly Lys Ile Arg Thr Thr Val Asp Pro Tyr Ser Lys

711 720 729 738 747 756 

CCG CTT TAC GCA AAC AAC CAA GAG AGC GCC GTT GTG AAT CTT GCC AGG ACA AAC

Ala Val Tyr Ala Asn Asn Gln Glu Ser Ala Val Val Asn Leu Ala Arg Thr Asn

765 774 783 792 801 810

CCA GAA GGA TGG GAA AAC GAC AGG GGA CCG AAA ATC GAA GGA TAC GAA GAC GCG 

Pro Glu Gly Trp Glu Asn Asp Arg Gly Pro Lys Ile Glu Gly Tyr Glu Asp Ala

819 828 837 846 855 864

ATA ATC TAT GAA ATA CAC ATA GCC GAC ATC ACA GGA CTC GAA AAC TCC GGG GTA

Ile Ile Tyr Glu Ile His Ile Ala Asp Ile Thr Gly Leu Glu Asn Ser Gly Val

873 882 891 900 909 918

AAA AAC AAA GGC CTC TAT CTC GGG CTC ACC GAA GAA AAC ACG AAA GGA CCG GGC 

Lys Asn Lys Gly Leu Tyr Leu Gly Leu Thr Glu Glu Asn Thr Lys Gly Pro Gly 

927 936 945 954 963 972

GGT GTG ACA ACA GGC CTT TCG CAC CTT GTG GAA CTC GGT GTT ACA CAC GTT CAT 

Gly Val Thr Thr Gly Leu Ser His Leu Val Glu Leu Gly Val Thr His Val His 

981 990 999 1008 1017 1026

ATA CTT CCT TTC TTT GAT TTC TAC ACA GCC GAC GAA CTC GAT AAA GAT TTC GAG

Ile Leu Pro Phe Phe Asp Phe Tyr Thr Gly Asp Glu Leu Asp Lys Asp Phe Glu

1035 1044 1053 1062 1071 1080 

AAG TAC TAC AAC TGG GGT TAC GAT CCT TAC CTG TTC ATG CTT CCG GAG GGC AGA

Lys Tyr Tyr Asn Trp Gly Tyr Asp Pro Tyr Leu Phe Met Val Pro Glu Gly Arg

Figure 14 (Continued)

1089 1098 1107 1116 1125 1134

TAC TCA ACC GAT CCC AAA AAC CCA CAC ACG AGA ATC AGA GAA GTC AAA GAA ATC

Tyr Ser Thr Asp Pro Lys Asn Pro His Thr Arg Ile Arg Glu Val Lys Glu Met

1143 1152 1161 1170 1179 1188 

GTC AAA GCC CTT CAC AAA CAC GGT ATA GGT GTG ATT ATG GAC ATG GTG TTC CCT 

Val Lys Ala Leu His Lys His Gly Ile Gly Val Ile Met Asp Met Val Phe Pro

1197 1206 1215 1224 1233 1242 

CAC ACC TAC GGT ATA GGC GAX CTC TCT GCG TTC GAT CAG ACG GTC CCG TAC TAC 

His Thr Tyr Gly Ile Gly Glu Leu Ser Ala Phe Asp Gln Thr Val Pro Tyr Tyr

1251 1260 1269 1278 1287 1296 

TTC TAC AGA ATC GAC AAG ACA GGT GCC TAT TTG AAC GAA AGC GGA TGT GGT AAC

Phe Tyr Arg Ile Asp Lys Thr Gly Ala Tyr Leu Asn Glu Ser Gly Cys Gly Asn

1305 1314 1323 1332 1341 1350 

GTC ATC GCA AGC GAA AGA CCC ATG ATG AGA AAA TTC ATA GTC GAT ACC GTC ACC 

Val Ile Ala Ser Glu Arg Pro Met Met Arg Lys Phe Ile Val Asp Thr Val Thr

1359 1368 1377 1386 1395 1404 

TAC TGG GTA AAG GAG TAT CAC ATA GAC GGA TTC AGG TTC GAT CAG ATG GGT CTC

Tyr Trp Val Lys Glu Tyr His Ile Asp Gly Phe Arg Phe Asp Gln Met Gly Leu

1413 1422 1431 1440 1449 1458 

ATC GAC AAA AAG ACA ATG CTC GAA CTC GAA AGA GCT CTT CAT AAA ATC GAT CCA 

Ile Asp Lys Lys Thr Met Leu Glu Val Glu Arg Ala Leu His Lys Ile Asp Pro

1467 1476 1485 1494 1503 1512 

ACT ATC ATT CTC TAC CCC GAA CCG TCC GGT GGA TGG GGA GCA CCG ATC AGG TTT 

Thr Ile Ile Leu Tyr Gly Glu Pro Trp Gly Gly Trp Gly Ala Pro Ile Arg Phe

1521 1530 1539 1548 1557 1566 

GGA AAG AGC GAT GTC GCC GCC ACA CAC CTC GCA GCT TTC AAC GAT GAG TTC AGA

Gly Lys Ser Asp Val Ala Gly Thr His Val Ala Ala Phe Asn Asp Glu Phe Arg

1575 1584 1593 1602 1611 1620 

GAC GCA ATA AGG GGT TCC GTG TTC AAC CCG AGC CTC AAG GGA TTC CTC ATG GGA

Asp Ala Ile Arg Gly Ser Val Phe Asn Pro Ser Val Lys Gly Phe Val Met Gly

Figure 14 (Continued)

2169 2178 2187 2196 2205 2214

ATT TAC AAT GGA AAC TTA GAG AAG ACA ACA TAC AAA CTG CCA GAA GGA AAA TGG

Ile Tyr Asn Gly Asn Leu Glu Lys Thr Thr Tyr Lys Leu Pro Glu Gly Lys Trp

2223 2232 2241 2250 2259 2268

AAT GTG GTT GTC AAC AGC CAG AAA GCC GGA ACA GAA GTG ATA GAA ACC GTC GAA

Asn Val Val Val Asn Ser Gln Lys Ala Gly Thr Glu Val Ile Glu Thr Val Glu

2277 2286 2295 2304 2313 

GGA ACA ATA GAA CTC GAT CCG CTT TCC GCG TAC GTT CTG TAC AGA GAG TCA 3'

Gly Thr Ile Glu Leu Asp Pro Leu Ser Ala Tyr Val Leu Tyr Arg Glu



Figure 14 (Continued) 
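
The residue rows in the sequence figures can be checked against the codon rows printed above them. The following is a minimal Python sketch of such a check, assuming the standard genetic code (the figures do not state which translation table was used). The names `translate` and `mismatches` are hypothetical helpers introduced only for this illustration.

```python
# Standard genetic code laid out in TCAG order. Assumption: the figures
# use the standard table; the source does not say so explicitly.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# One-letter to three-letter residue codes; stop codons print as "***".
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}

# Build the 64-entry codon table from the compact string above.
CODON_TABLE = {
    b1 + b2 + b3: THREE_LETTER[AMINO[16 * i + 4 * j + k]]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(codon_row):
    """Translate a whitespace-separated codon row; '???' marks unreadable codons."""
    return " ".join(CODON_TABLE.get(c.upper(), "???") for c in codon_row.split())


def mismatches(codon_row, residue_row):
    """Return (index, codon, printed, expected) where the printed residue disagrees."""
    codons = codon_row.split()
    expected = translate(codon_row).split()
    printed = residue_row.split()
    return [
        (i, c, p, e)
        for i, (c, p, e) in enumerate(zip(codons, printed, expected))
        if p != e
    ]


if __name__ == "__main__":
    row = "AAA ATG GTA GGT AAA GGA GTA AAT ATT GGA AAT GCT TTA GAA GCT CCT TTC GAA"
    print(translate(row))
```

Run on the codon row numbered 171 to 216 in the listings above, `translate` gives Lys Met Val Gly Lys Gly Val Asn Ile Gly Asn Ala Leu Glu Ala Pro Phe Glu, which agrees with the printed residue row. Positions where a codon and its printed residue disagree are reported by `mismatches`; a disagreement may reflect a scanning error in either row rather than an error in the sequence itself.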



INTERNATIONAL SEARCH REPORT 


International application No. 
PCT/US97/00092 


A. CLASSIFICATION OF SUBJECT MATTER

IPC(6) : Please See Extra Sheet.

US CL : 435/201, 252.33; 536/23.2
According to International Patent Classification (IPC) or to both national classification and IPC


B. FIELDS SEARCHED


Minimum documentation searched (classification system followed by classification symbols) 
U.S. : 435/201, 252.33; 536/23.2


Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched 


Electronic data base consulted during the international search (name of data base and, where practicable, search terms used)
aps, caplus, biosis

search terms: glycosidase(s), thermococcus, staphylothermus, pyrococcus


C. DOCUMENTS CONSIDERED TO BE RELEVANT 


Category* 


Citation of document, with indication, where appropriate, of the relevant passages


Relevant to claim No. 


X 


VOORHORST et al. Characterization of the celB gene coding
for β-glucosidase from the hyperthermophilic archaeon
Pyrococcus furiosus and its expression and site-directed
mutation in Escherichia coli. J. Bacteriology, December
1995, Vol. 177, No. 24, pages 7105-7111, especially pages
7105, 7106 and 7108.


1-9 


Y 


Database CAPLUS on STN, CAS, (Columbus, OH, USA), AN
1996:106914, KENGEN et al. "An extremely thermostable 
.beta.-glucosidase from the hyperthermophilic archaeon 
Pyrococcus furiosus; a comparison with other glycosidases," 
Biocatalysis 1994, Vol. 11, No. 2, pages 79-88. Abstract. 


1-9 


[X] Further documents are listed in the continuation of Box C.


[ ] See patent family annex.




* Special categories of cited documents:

"A" document defining the general state of the art which is not considered
to be of particular relevance

"E" earlier document published on or after the international filing date

"L" document which may throw doubts on priority claim(s) or which is
cited to establish the publication date of another citation or other
special reason (as specified)

"O" document referring to an oral disclosure, use, exhibition or other means

"P" document published prior to the international filing date but later than
the priority date claimed

"T" later document published after the international filing date or priority
date and not in conflict with the application but cited to understand the
principle or theory underlying the invention

"X" document of particular relevance; the claimed invention cannot be
considered novel or cannot be considered to involve an inventive step
when the document is taken alone

"Y" document of particular relevance; the claimed invention cannot be
considered to involve an inventive step when the document is
combined with one or more other such documents, such combination
being obvious to a person skilled in the art

"&" document member of the same patent family


Date of the actual completion of the international search 
29 MARCH 1997 


Date of mailing of the international search report 

0 9 M mi 


Name and mailing address of the ISA/US 
Commissioner of Patents and Trademarks

Box PCT

Washington, D.C. 20231
Facsimile No. (703) 305-3230


Authorized officer

ELIZABETH SLOBODYANSKY
Telephone No. (703) 308-0196



Form PCT/ISA/210 (second sheet)(July 1992)*



BNSDOCID: <WO_9725417A1_I_>






C (Continuation). DOCUMENTS CONSIDERED TO BE RELEVANT



Category*


Citation of document, with indication, where appropriate, of the relevant passages


Relevant to claim No.


X, P


BAUER et al. Comparison of β-glucosidase and β-mannosidase
from the hyperthermophilic archaeon Pyrococcus furiosus. J. Biol.
Chem. 27 September 1996, Vol. 271, No. 39, pages 23749-
23755, see entire document.


1-9



Form PCT/ISA/210 (continuation of second sheet)(July 1992)*






A. CLASSIFICATION OF SUBJECT MATTER 
IPC (6): 

C12N 9/26, 1/20; C07H 21/04

BOX II. OBSERVATIONS WHERE UNITY OF INVENTION WAS LACKING 
This ISA found multiple inventions as follows: 

Group I, claims 1-9, drawn to a DNA, a vector comprising the DNA, a cell transformed with the same and a process
for producing a peptide.

Group II, claim 10, drawn to an enzyme.

Group III, claim 11, drawn to a method of use of an enzyme.

The inventions listed as Groups I and II do not relate to a single inventive concept under PCT Rule 13.1 because, under
PCT Rule 13.2, they lack the same or corresponding special technical features for the following reasons: A DNA of
Group I and an enzyme of Group II are different compounds with different chemical structures and different utilities
and therefore do not share a special technical feature. The method of Group III uses an enzyme and therefore does not
share a special technical feature with Group I. PCT Rule 1.475(d) does not provide for the multiple products or
methods within a single application and therefore unity of invention is lacking with regard to Groups I, II and III.



Form PCT/ISA/210 (extra sheet)(July 1992)*






